Isaku Yamahata [Tue, 6 Jan 2009 09:05:32 +0000 (18:05 +0900)]
[IA64] Fix some IPF Xen VT-d bugs.
In arch_domain_create(): when xen creates Dom0, need_iommu(d) is false,
so iommu_domain_init() is not invoked. As a result, the iommu is
never enabled properly.
Note: d->need_iommu is set to 1 only by assign_device(), which is called
via the XEN_DOMCTL_assign_device hypercall and is never called for dom0.
In IA64 Xen, physdev_map_pirq()/physdev_unmap_pirq() are kept as dummies since
we don't support MSI in IA64 Xen now, but they shouldn't return
-ENOSYS because xend invokes them (the x86 versions are necessary
for x86 Xen); in IPF Xen, if they return -ENOSYS, xend would not allow us
to create an IPF HVM guest with devices assigned. They can return 0 instead.
Signed-off-by: Dexuan Cui <dexuan.cui@intel.com>
Isaku Yamahata [Mon, 5 Jan 2009 05:13:38 +0000 (14:13 +0900)]
[IA64] fix ia64_fast_eoi hypercall to catch up PHYSDEVOP_pirq_eoi_gmfn
ia64 xen Linux uses ia64_fast_eoi to do eoi, so c/s 18862:f0a9a58608a0
should also have changed od_pir_guest_eoi().
This patch changes it.
Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
Isaku Yamahata [Mon, 5 Jan 2009 03:24:58 +0000 (12:24 +0900)]
[IA64] fix mis-setting ed bit for itlb entry for hvm domain.
This patch fixes a Windows BSOD issue caused by mis-setting the pte's ED bit
for an itlb entry.
The hash vTLB is a unified tlb and doesn't differentiate itc and dtc
in its implementation, so the itlb_miss handler may reference a dtlb entry in
the hash vTLB.
But this may cause problems, because the dtlb entry's ED bit may differ from
the itlb's setting.
Since the case is very rare, just purge the corresponding entry in the hash
vTLB once it is found and let the guest OS determine how to set the ED bit
for the itlb mapping.
Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
Isaku Yamahata [Mon, 5 Jan 2009 03:24:58 +0000 (12:24 +0900)]
[IA64] paravirtualize itc and support save/restore.
ia64 linux 2.6.18 only uses ar.itc for local ticks, so
ar.itc didn't need paravirtualization and could be worked around
at save/restore.
However, recent ia64 linux uses ar.itc for sched_clock(),
CONFIG_VIRT_CPU_ACCOUNTING and elsewhere, so ar.itc needs
paravirtualization. Although most of the work is done in the guest OS,
save/restore needs hypervisor support.
Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
Isaku Yamahata [Mon, 5 Jan 2009 03:24:58 +0000 (12:24 +0900)]
[IA64] remove warning.
This patch removes the following warning.
> hypercall.c:205: warning: implicit declaration of function 'vmx_lazy_load_fpu'
Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
Isaku Yamahata [Wed, 24 Dec 2008 03:52:34 +0000 (12:52 +0900)]
merge with xen-unstable.hg
Isaku Yamahata [Wed, 24 Dec 2008 03:50:57 +0000 (12:50 +0900)]
[IA64]: Fix BUILD_BUG_ON().
This is the ia64 counterpart of 1419a73316e1.
Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
Isaku Yamahata [Wed, 24 Dec 2008 03:50:55 +0000 (12:50 +0900)]
[IA64]: fix compilation error.
BUILD_BUG_ON() was changed so that it can no longer be
used with symbol values.
Fortunately dom_fpswa_hypercall_patch() isn't performance critical,
so replace BUILD_BUG_ON() with BUG_ON().
Also fix the condition, which had an off-by-one bug.
Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
Keir Fraser [Mon, 22 Dec 2008 13:48:40 +0000 (13:48 +0000)]
i386: Fix the build.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Mon, 22 Dec 2008 13:43:13 +0000 (13:43 +0000)]
shadow: Remove warnings about writes to read-only BIOS area. These
attempts can be legitimate.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Mon, 22 Dec 2008 12:07:20 +0000 (12:07 +0000)]
Cleanup Intel CMCI support.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Mon, 22 Dec 2008 08:12:33 +0000 (08:12 +0000)]
Enable CMCI for Intel CPUs
Signed-off-by: Yunhong Jiang <yunhong.jiang@intel.com>
Signed-off-by: Liping Ke <liping.ke@intel.com>
Keir Fraser [Fri, 19 Dec 2008 14:56:36 +0000 (14:56 +0000)]
Support S3 for MSI interrupt
From: "Jiang, Yunhong" <yunhong.jiang@intel.com>
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Fri, 19 Dec 2008 14:52:32 +0000 (14:52 +0000)]
Change the pcidevs_lock from rw_lock to spin_lock
As pcidevs_lock now protects more than just the alldevs_list,
it doesn't benefit much from being an rw_lock. Also, the
previous patch 18906:2941b1a97c60 was wrong to use read_lock to protect some
sensitive data (thanks to Espen for pointing that out).
This patch also contains two minor fixes:
a) deassign_device will deadlock when trying to get the pcidevs_lock if
called by pci_release_devices; move the locking to the caller.
b) iommu_domain_teardown should not ASSERT on the pcidevs_lock
because it just updates the domain's vt-d mapping.
Signed-off-by: Yunhong Jiang <yunhong.jiang@intel.com>
Keir Fraser [Fri, 19 Dec 2008 14:44:40 +0000 (14:44 +0000)]
CPUIDLE: adjust cstate statistic interface
1. change unit of residency, PM ticks -> ns.
2. output C0 usage & residency.
Signed-off-by: Wei Gang <gang.wei@intel.com>
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
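The unit change in item 1 above can be illustrated with a minimal sketch. It assumes the residency was previously counted in ACPI PM timer ticks at the standard 3.579545 MHz rate; the helper name pm_ticks_to_ns() is hypothetical and is not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* ACPI PM timer frequency in Hz (3.579545 MHz). */
#define PM_TIMER_HZ  3579545ULL
#define NSEC_PER_SEC 1000000000ULL

/* Hypothetical helper: convert a residency in PM timer ticks to nanoseconds. */
uint64_t pm_ticks_to_ns(uint64_t ticks)
{
    /* Handle whole seconds separately so the multiplication cannot overflow. */
    uint64_t secs = ticks / PM_TIMER_HZ;
    uint64_t rem  = ticks % PM_TIMER_HZ;

    return secs * NSEC_PER_SEC + (rem * NSEC_PER_SEC) / PM_TIMER_HZ;
}
```

Splitting off the whole seconds keeps the intermediate product within 64 bits even for very long residencies.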
Keir Fraser [Fri, 19 Dec 2008 13:42:04 +0000 (13:42 +0000)]
VT-d: Fix PCI-X device assignment
When assigning a PCI device, the current code just maps its bridge and its
secondary bus number and devfn 0. This doesn't work for PCI-X device
assignment, because the request may carry the source-id of the original
PCI-X transaction or the source-id provided by the bridge. It is necessary to
map the device itself and its upstream bridges up to the PCIe-to-PCI/PCI-X
bridge.
In addition, add descriptions of DEV_TYPE_PCIe_BRIDGE and
DEV_TYPE_PCI_BRIDGE for readability.
Signed-off-by: Weidong Han <weidong.han@intel.com>
Keir Fraser [Thu, 18 Dec 2008 17:18:28 +0000 (17:18 +0000)]
xend: Actually restrict a domU's access to xenstore when we mean to --
this means that in some cases it cannot be owner of its own xenstore
nodes.
This bug was pointed out by Daniel Berrange at Red Hat. This patch is
my own more generic fix that automatically covers a range of callers
(albeit the patch is arguably a bit of a hack ;-).
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Thu, 18 Dec 2008 17:14:27 +0000 (17:14 +0000)]
x86: Quieten tracing in msi startup/shutdown handlers.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Thu, 18 Dec 2008 14:52:53 +0000 (14:52 +0000)]
rombios: Update to Bochs latest
Signed-off-by: Akio Takebe <takebe_akio@jp.fujitsu.com>
Keir Fraser [Thu, 18 Dec 2008 11:29:33 +0000 (11:29 +0000)]
xenoprof: Add support for Intel Dunnington cores.
Signed-off-by: Xiaowei Yang <Xiaowei.yang@intel.com>
Signed-off-by: Ting Zhou <ting.g.zhou@intel.com>
Keir Fraser [Thu, 18 Dec 2008 11:28:25 +0000 (11:28 +0000)]
x86, shadow: Avoid duplicates in fixup tables.
Avoid entering duplicates in fixup tables, reducing fixup evictions.
Signed-off-by: Gianluca Guida <gianluca.guida@eu.citrix.com>
Keir Fraser [Thu, 18 Dec 2008 11:27:37 +0000 (11:27 +0000)]
Fix mini-os ia64 compilation
- Avoid nested function to avoid a trampoline.
- Do not link mini-os_app.o when it is empty.
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Keir Fraser [Wed, 17 Dec 2008 11:36:22 +0000 (11:36 +0000)]
x86, hvm: Don't ever call the shadow code to fix a page fault in an
external-mode guest if the fault came from Xen; it would be making
changes to the wrong pagetables, potentially causing a pagefault loop
in Xen.
Signed-off-by: Tim Deegan <Tim.Deegan@citrix.com>
Keir Fraser [Tue, 16 Dec 2008 13:14:25 +0000 (13:14 +0000)]
xenpm: add cpu frequency control interface, through which user can
tune the parameters manually.
Now, xenpm can be invoked with the following options:
Usage:
xenpm get-cpuidle-states [cpuid]: list cpu idle information on
CPU cpuid or all CPUs.
xenpm get-cpufreq-states [cpuid]: list cpu frequency
information on CPU cpuid or all CPUs.
xenpm get-cpufreq-para [cpuid]: list cpu frequency information
on CPU cpuid or all CPUs.
xenpm set-scaling-maxfreq <cpuid> <HZ>: set max cpu frequency
<HZ> on CPU <cpuid>.
xenpm set-scaling-minfreq <cpuid> <HZ>: set min cpu frequency
<HZ> on CPU <cpuid>.
xenpm set-scaling-governor <cpuid> <name>: set scaling governor
on CPU <cpuid>.
xenpm set-scaling-speed <cpuid> <num>: set scaling speed on CPU
<cpuid>.
xenpm set-sampling-rate <cpuid> <num>: set sampling rate on CPU
<cpuid>.
xenpm set-up-threshold <cpuid> <num>: set up threshold on CPU
<cpuid>.
To ease the use of this tool, the shortcut option is supported,
i.e. `xenpm get-cpui' is equal to `xenpm get-cpuidle-states'.
Signed-off-by: Guanqun Lu <guanqun.lu@intel.com>
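The shortcut option described above amounts to prefix matching on the subcommand name. The following is a minimal sketch only: the function match_command() is hypothetical, and rejecting ambiguous prefixes is an assumption, since the message does not say how xenpm itself resolves them.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Subcommand names as listed in the usage text above. */
static const char *const xenpm_commands[] = {
    "get-cpuidle-states", "get-cpufreq-states", "get-cpufreq-para",
    "set-scaling-maxfreq", "set-scaling-minfreq", "set-scaling-governor",
    "set-scaling-speed", "set-sampling-rate", "set-up-threshold",
};
#define NR_COMMANDS (sizeof(xenpm_commands) / sizeof(xenpm_commands[0]))

/*
 * Hypothetical lookup: return the index of the command selected by 'arg',
 * or -1 if no command, or more than one command, starts with 'arg'.
 */
int match_command(const char *arg)
{
    size_t len = strlen(arg);
    int found = -1;
    int matches = 0;

    for (size_t i = 0; i < NR_COMMANDS; i++) {
        if (strncmp(arg, xenpm_commands[i], len) != 0)
            continue;
        if (xenpm_commands[i][len] == '\0')
            return (int)i;          /* exact match wins */
        found = (int)i;
        matches++;
    }
    return matches == 1 ? found : -1;
}
```

With this table, "get-cpui" selects get-cpuidle-states, while "get-cpuf" is rejected because it is a prefix of two commands.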
Keir Fraser [Tue, 16 Dec 2008 12:04:13 +0000 (12:04 +0000)]
x86: Update xen-detect utility to scan for Xen signature in CPUID space.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Tue, 16 Dec 2008 12:00:25 +0000 (12:00 +0000)]
mini-os: Make utility function get_self_id() in fs-front.c public.
Signed-off-by: Yosuke Iwamatsu <y-iwamatsu@ab.jp.nec.com>
Keir Fraser [Tue, 16 Dec 2008 11:59:22 +0000 (11:59 +0000)]
x86: Simpler time handling when TSC is constant across all power saving states.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Signed-off-by: Gang Wei <gang.wei@intel.com>
Keir Fraser [Tue, 16 Dec 2008 11:54:11 +0000 (11:54 +0000)]
vmx: Do not disable real EFER.NXE even when disabled by guest.
We must not disable EFER.NXE in host mode since shadow code relies on
accessing shadow mappings with NX set.
We do not want to write EFER on every vmentry/vmexit if we can avoid
it, since it will be somewhat slow.
Finally, we don't believe that any guest relies on NX really being
disabled when EFER.NXE is cleared.
This given, it makes sense to ignore the guest's setting of EFER.NXE.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
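A minimal sketch of the idea, assuming the guest-visible EFER value is tracked separately from the value in effect on hardware. The struct and function names are hypothetical; only the position of EFER.NXE (bit 11) is architectural.

```c
#include <assert.h>
#include <stdint.h>

#define EFER_NX (1ULL << 11)   /* EFER.NXE is bit 11 of the EFER MSR */

/* Hypothetical split between what the guest sees and what the CPU runs with. */
struct efer_state {
    uint64_t guest_view;   /* value returned when the guest reads EFER */
    uint64_t hw_value;     /* value in effect on hardware */
};

/* Record a guest write to EFER, keeping NXE set on the hardware side. */
struct efer_state guest_write_efer(uint64_t val)
{
    struct efer_state s;

    s.guest_view = val;            /* the guest still sees what it wrote */
    s.hw_value = val | EFER_NX;    /* the guest's NXE setting is ignored */
    return s;
}
```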
Keir Fraser [Tue, 16 Dec 2008 11:49:20 +0000 (11:49 +0000)]
x86: Enable MTF for HVM guest single step in gdb
Signed-off-by: Edwin Zhai <edwin.zhai@intel.com>
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Mon, 15 Dec 2008 11:37:14 +0000 (11:37 +0000)]
x86: Decode CPUID for TSC guarantees.
Signed-off-by: Wei Gang <gang.wei@intel.com>
Keir Fraser [Mon, 15 Dec 2008 11:23:22 +0000 (11:23 +0000)]
rombios: fix references to EBDA
The Extended BIOS Data Area (EBDA) can be relocated by the initialization
of a PCI option ROM, and so can the IPL boot table.
After the initialization, the EBDA must be accessed via 0x40E.
Signed-off-by: Kouya Shimura <kouya@jp.fujitsu.com>
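To illustrate why 0x40E matters: the 16-bit word at linear address 0x40E in the BIOS Data Area holds the current EBDA segment, so reading it on each access follows any relocation. The sketch below works on a plain byte buffer standing in for low memory; the helper names are hypothetical and this is not the rombios code.

```c
#include <assert.h>
#include <stdint.h>

/* Linear address of the BDA word (0x40:0x0E) holding the EBDA segment. */
#define BDA_EBDA_SEG_ADDR 0x40Eu

/* Hypothetical helper: read the EBDA segment from a copy of low memory. */
uint16_t ebda_segment(const uint8_t *low_mem)
{
    /* The word is stored little-endian. */
    return (uint16_t)(low_mem[BDA_EBDA_SEG_ADDR] |
                      (low_mem[BDA_EBDA_SEG_ADDR + 1] << 8));
}

/* Real-mode segment to linear address: segment * 16. */
uint32_t ebda_linear_address(const uint8_t *low_mem)
{
    return (uint32_t)ebda_segment(low_mem) << 4;
}
```

A caller that caches the segment before option ROM initialization would keep using the stale location, which is the class of bug the message describes.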
Keir Fraser [Mon, 15 Dec 2008 11:17:14 +0000 (11:17 +0000)]
x86: Small cleanups to time handling.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Sat, 13 Dec 2008 17:44:20 +0000 (17:44 +0000)]
xenpmd: Fix bogus fgets() size parameter.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Sat, 13 Dec 2008 15:56:16 +0000 (15:56 +0000)]
x86: Clean up and simplify rwlock implementation.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Sat, 13 Dec 2008 15:28:10 +0000 (15:28 +0000)]
Clean up use of spin_is_locked() and introduce rw_is_locked().
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Sat, 13 Dec 2008 15:04:53 +0000 (15:04 +0000)]
xc_pm: Fix off-by-one error in string array access.
Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
Keir Fraser [Sat, 13 Dec 2008 15:02:55 +0000 (15:02 +0000)]
x86: Fix early time initialisation after recent changes.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Isaku Yamahata [Fri, 12 Dec 2008 01:43:39 +0000 (10:43 +0900)]
IA64: quieten PV fp fault/trap handler.
Now fp fault/trap is handled correctly except when fpswa
returns an error. So quieten the handler.
Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
Isaku Yamahata [Fri, 12 Dec 2008 01:36:23 +0000 (10:36 +0900)]
IA64: fix panic caused by daccess fault.
During fpswa emulation, the Xen VMM accesses the guest virtual address space,
which may cause a daccess fault resulting in a panic.
This patch makes the daccess fault handler handle such cases properly.
(XEN) Xen BUG at faults.c:583
(XEN) FIXME: implement ia64 dump_execution_state()
(XEN)
(XEN) Call Trace:
(XEN) [<f4000000040fe360>] show_stack+0x90/0xb0
(XEN) sp=f0000002b6067940 bsp=f0000002b6061860
(XEN) [<f4000000040fee70>] dump_stack+0x30/0x50
(XEN) sp=f0000002b6067b10 bsp=f0000002b6061840
(XEN) [<f4000000040640d0>] __bug+0x70/0xa0
(XEN) sp=f0000002b6067b10 bsp=f0000002b6061810
(XEN) [<f4000000040b53b0>] ia64_handle_reflection+0x60/0x13b0
(XEN) sp=f0000002b6067b10 bsp=f0000002b60617b8
(XEN) [<f4000000040f5b40>] ia64_leave_kernel+0x0/0x300
(XEN) sp=f0000002b6067b20 bsp=f0000002b60617b8
(XEN) [<f4000000040c3a20>] __get_domain_bundle+0x0/0x40
(XEN) sp=f0000002b6067d20 bsp=f0000002b6061778
(XEN) [<f4000000040bee20>] vcpu_get_domain_bundle+0xb0/0xa10
(XEN) sp=f0000002b6067d20 bsp=f0000002b60616e8
(XEN) [<f4000000040b3f20>] handle_fpu_swa+0x360/0x4a0
(XEN) sp=f0000002b6067d60 bsp=f0000002b6061660
(XEN) vcpu.c:1371: vcpu_get_domain_bundle gip 0x40000000000008a0
(XEN) [<f4000000040b5e90>] ia64_handle_reflection+0xb40/0x13b0
(XEN) sp=f0000002b6067df0 bsp=f0000002b6061610
(XEN) vcpu.c:1371: vcpu_get_domain_bundle gip 0x4000000000000730
(XEN) faults.c:343:d6 handle_fpu_swa(fault): floating-point bundle at 0x4000000000000730 not mapped
(XEN) [<f4000000040f5b40>] ia64_leave_kernel+0x0/0x300
(XEN) sp=f0000002b6067e00 bsp=f0000002b6061610
(XEN) vcpu.c:1371: vcpu_get_domain_bundle gip 0x40000000000008a0
(XEN) faults.c:343:d6 handle_fpu_swa(fault): floating-point bundle at 0x40000000000008a0 not mapped
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 15:
(XEN) Xen BUG at faults.c:583
(XEN) ****************************************
Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
Isaku Yamahata [Fri, 12 Dec 2008 01:35:58 +0000 (10:35 +0900)]
IA64: make the fpswa emulation keep the previous behaviour.
When the fpswa library returns status > 0, keep the previous behavior.
This case should be addressed somehow later, but it seems somewhat
difficult to resolve, so keep the previous behavior for now.
It is assumed that a guest kernel calls the fpswa library
without preemption. This assumption breaks if the guest kernel is
preemptive.
Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
Isaku Yamahata [Fri, 12 Dec 2008 01:34:18 +0000 (10:34 +0900)]
IA64: fix fp fault/trap handler.
This patch is part of the fixes for the bug reported as
http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1392
When the fpswa handler fails to get a bundle in the guest,
the fp fault/trap should be injected into the guest so that the guest
can handle it.
When the fpswa library returns an error, there is no way to
pass the value to the guest. In that case, just inject the fpswa
fault/trap into the guest, running the risk that the guest may get
an error from its own fpswa call. Here it is assumed that
no applications depend on the SIGFPE process signal to recover
their computation.
Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
Isaku Yamahata [Fri, 12 Dec 2008 01:29:15 +0000 (10:29 +0900)]
IA64: fix emulation of fp emulation in pv domain
This patch is part of the fixes for the bug reported as
http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1392
When the vmm fails to get the bundle in question during fpswa processing,
there is no alternative but for the guest to provide the bundle.
The current implementation, on the other hand, just returns a random value.
This patch makes the fpswa hypercall calling convention more complicated and
passes the necessary information to the hypervisor.
Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
Keir Fraser [Thu, 11 Dec 2008 22:32:20 +0000 (22:32 +0000)]
xentop: Fix fprintf() build failure.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Thu, 11 Dec 2008 13:26:02 +0000 (13:26 +0000)]
hvmloader: enable bus mastering of PCI device
Without this, init routine in some PCI option ROM doesn't work well.
Signed-off-by: Kouya Shimura <kouya@jp.fujitsu.com>
Keir Fraser [Thu, 11 Dec 2008 13:25:28 +0000 (13:25 +0000)]
x86: enable interrupts explicitly in __start_xen()
Instead of relying on smp_prepare_cpus() (via check_nmi_watchdog()) or
init_xen_time() (via init_platform_timer() -> plt_overflow())
implicitly enabling interrupts, enable them explicitly once safe to do
so (it may actually be possible to move this even further up, but I
don't think that would buy us much).
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Also move spin_debug_enable() a bit higher. Moving it above
smp_prepare_cpus() didn't work for some reason though!
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Thu, 11 Dec 2008 13:10:19 +0000 (13:10 +0000)]
x86: Clean up early time setup.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Thu, 11 Dec 2008 13:09:59 +0000 (13:09 +0000)]
vga: Only vga_endboot() if vga_init() completed.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Thu, 11 Dec 2008 11:49:37 +0000 (11:49 +0000)]
x86, cpufreq: Change cpufreq_driver->get so that it can get other
cpu's real physical freq.
Signed-off-by: Liu, Jinsong <jinsong.liu@intel.com>
Keir Fraser [Thu, 11 Dec 2008 11:48:19 +0000 (11:48 +0000)]
Re-enable MSI support
Currently the MSI is disabled because of some lock issue. This patch
tries to clean up the locking related to MSI lock.
Signed-off-by: Jiang Yunhong <yunhong.jiang@intel.com>
Keir Fraser [Thu, 11 Dec 2008 11:40:10 +0000 (11:40 +0000)]
x86: fix the potential of encountering panic "IO-APIC + timer doesn't work! ..."
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Linux commit:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=4aae07025265151e3f7041dfbf0f529e122de1d8
x86: fix "Kernel panic - not syncing: IO-APIC + timer doesn't work!"
Under rare circumstances we found we could have an IRQ0 entry while we
are in the middle of setting up the local APIC, the i8259A and the
PIT. That is certainly not how it's supposed to work! check_timer()
was supposed to be called with irqs turned off - but this eroded away
sometime in the past. This code would still work most of the time
because this code runs very quickly, but if just the right timing
conditions are present and IRQ0 hits in this small, ~30 usecs window,
timer irqs stop and the system does not boot up. Also, given how early
this is during bootup, the hang is very deterministic - but it would
only occur on certain machines (and certain configs).
The fix was quite simple: disable/restore interrupts properly in this
function. With that in place the test-system now boots up just fine.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Keir Fraser [Thu, 11 Dec 2008 11:36:00 +0000 (11:36 +0000)]
x86: unify local_irq_XXX()
This also removes an inconsistency in that x86-64's __save_flags() had
a memory clobber, while x86_32's didn't.
It further adds type checking since blindly using {pop,push}{l,q} on a
memory operand of unknown size bears the risk of corrupting other
data.
Finally, it eliminates the redundant (with local_irq_restore())
__restore_flags() macro and renames __save_flags() to
local_save_flags(), making the naming consistent with Linux (again?).
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Keir Fraser [Thu, 11 Dec 2008 11:32:39 +0000 (11:32 +0000)]
rombios: fix rom_scan (ja->jmp)
Signed-off-by: Akio Takebe <takebe_akio@jp.fujitsu.com>
Signed-off-by: Kouya Shimura <kouya@jp.fujitsu.com>
Keir Fraser [Thu, 11 Dec 2008 11:30:13 +0000 (11:30 +0000)]
Fix a typo caused by 18898.
new state is updated too early.
Signed-off-by: Kevin Tian <kevin.tian@intel.com>
Keir Fraser [Thu, 11 Dec 2008 11:27:49 +0000 (11:27 +0000)]
cpufreq: Short path avoiding IPI in critical fast path.
Signed-off-by: Liu, Jinsong <jinsong.liu@intel.com>
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Thu, 11 Dec 2008 11:19:27 +0000 (11:19 +0000)]
libxc: Fix xc_pm.c build by avoiding bogus header includes.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Thu, 11 Dec 2008 11:19:01 +0000 (11:19 +0000)]
Fix BUILD_BUG_ON()
As was noticed on the Linux side, using an array here isn't appropriate
if the condition is not a compile time constant - gcc allows such
arrays, and hence the intended effect of producing a compiler error is
not achieved in that case. Bit-field widths are subject to no similar
language extension, and hence always produce a compiler error.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
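The bit-field technique described above can be sketched as follows. This mirrors the form used on the Linux side as far as I know; the exact macro in the tree may differ, and struct hypothetical_header and checked_header_size() exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/*
 * A true condition yields a bit-field of width -1, which is always a
 * compile error. A condition that is not a compile-time constant is
 * rejected as well, because a bit-field width must be constant.
 */
#define BUILD_BUG_ON(cond) ((void)sizeof(struct { int : -!!(cond); }))

/* Hypothetical structure whose layout we want to pin down at build time. */
struct hypothetical_header {
    unsigned int a;
    unsigned int b;
};

size_t checked_header_size(void)
{
    /* Compiles only while the structure is exactly two unsigned ints. */
    BUILD_BUG_ON(sizeof(struct hypothetical_header) != 2 * sizeof(unsigned int));
    return sizeof(struct hypothetical_header);
}
```

By contrast, an array-based form such as a typedef of char[1 - 2 * !!(cond)] silently becomes a variable-length array under gcc when cond is not constant, which is the failure mode the message describes.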
Keir Fraser [Wed, 10 Dec 2008 14:05:41 +0000 (14:05 +0000)]
Avoid negative runstate pieces.
Also consolidate all places to get cpu idle time.
Signed-off-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Wed, 10 Dec 2008 13:41:34 +0000 (13:41 +0000)]
Initialize state_entry_time to zero for all idle vcpus
NOW() is not usable since the xen time sub-system hasn't
been initialized yet. On my box, it gives an initial
stamp of ~60s because the local tsc stamp is zero and the TSC
count starts from power on. A negative value is then
added to the runstate of that idle vcpu at the schedule
point. The net effect is that tools like xenpm
show a big idle time gap between the BSP and the other APs.
Signed-off-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Wed, 10 Dec 2008 13:30:10 +0000 (13:30 +0000)]
x86: Make MCE panic message more obvious
Make it more obvious to the untrained user that machine check reboots
are hardware faults, rather than just saying "CPU context corrupt".
Signed-off-by: Tim Deegan <Tim.Deegan@citrix.com>
Keir Fraser [Wed, 10 Dec 2008 13:28:58 +0000 (13:28 +0000)]
gdbserver: Fix build failure.
From: Edwin Zhai <edwin.zhai@intel.com>
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Wed, 10 Dec 2008 13:27:41 +0000 (13:27 +0000)]
Add user PM control interface
Signed-off-by: Liu, Jinsong <jinsong.liu@intel.com>
Keir Fraser [Wed, 10 Dec 2008 13:27:14 +0000 (13:27 +0000)]
Add cpufreq governors: performance, powersave, userspace
This patch adds 3 more governors besides the original ondemand
cpufreq governor.
The performance governor gives the best performance, keeping the cpu always
running at the highest freq;
the powersave governor gives the best power saving, keeping the cpu always
running at the lowest freq;
the userspace governor lets the user set the freq.
Signed-off-by: Liu, Jinsong <jinsong.liu@intel.com>
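The three policies above can be summarised in a small sketch. The enum, the function name and the clamping of the user-supplied value to the hardware limits are all assumptions for illustration and are not taken from the patch.

```c
#include <assert.h>

/* Hypothetical identifiers for the three governors named above. */
enum governor { GOV_PERFORMANCE, GOV_POWERSAVE, GOV_USERSPACE };

/* Pick the target frequency (in kHz) that each governor would ask for. */
unsigned int governor_target_khz(enum governor g, unsigned int min_khz,
                                 unsigned int max_khz, unsigned int user_khz)
{
    switch (g) {
    case GOV_PERFORMANCE:
        return max_khz;             /* always the highest frequency */
    case GOV_POWERSAVE:
        return min_khz;             /* always the lowest frequency */
    case GOV_USERSPACE:
        /* Assumption: a user request outside the limits is clamped. */
        if (user_khz < min_khz)
            return min_khz;
        if (user_khz > max_khz)
            return max_khz;
        return user_khz;
    }
    return max_khz;                 /* not reached for valid governors */
}
```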
Keir Fraser [Wed, 10 Dec 2008 13:14:13 +0000 (13:14 +0000)]
libxc: Fix memory leak in zlib usage
Any call to inflate() must be followed by inflateEnd(), otherwise the
internal zlib state is leaked.
Signed-off-by: Kevin Wolf <kwolf@suse.de>
Isaku Yamahata [Wed, 10 Dec 2008 06:39:47 +0000 (15:39 +0900)]
IA64: fix efi_emulate_set_virtual_address_map()
get_page() before touching guest pages.
Otherwise pages may be freed during those operations.
Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
Isaku Yamahata [Wed, 10 Dec 2008 06:39:46 +0000 (15:39 +0900)]
IA64: improve handle_fpu_swa()
It tries to get a bundle in guest.
Make it more robust using vmx_get_domain_bundle() instead of
__get_domain_bundle().
Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
Isaku Yamahata [Wed, 10 Dec 2008 06:39:44 +0000 (15:39 +0900)]
IA64: use symbolic constant for hypercall.
define symbolic names for hypercall number and use them.
Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
Keir Fraser [Tue, 9 Dec 2008 16:28:02 +0000 (16:28 +0000)]
Use virtual 8086 mode for VMX guests with CR0.PE == 0
When a VMX guest tries to enter real mode, put it in virtual 8086 mode
instead, if that's possible. Handle all errors and corner cases by
falling back to the real-mode emulator.
This is similar to the old VMXASSIST system except it uses Xen's
x86_emulate emulator instead of having a partial emulator in the guest
firmware. It more than doubles the speed of real-mode operation on
VMX.
Signed-off-by: Tim Deegan <Tim.Deegan@citrix.com>
Keir Fraser [Tue, 9 Dec 2008 13:23:15 +0000 (13:23 +0000)]
vga: Fix screen clear at end of Xen bootstrap.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Tue, 9 Dec 2008 13:06:19 +0000 (13:06 +0000)]
pv-on-hvm: add pvSCSI frontend
Signed-off-by: Tomonari Horikoshi <t.horikoshi@jp.fujitsu.com>
Signed-off-by: Jun Kamada <kama@jp.fujitsu.com>
Keir Fraser [Tue, 9 Dec 2008 13:00:52 +0000 (13:00 +0000)]
pv-on-hvm: fix for Centos 5.2
From: Yoshisato YANAGISAWA <yanagisawa.yoshisato@lab.ntt.co.jp>
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Tue, 9 Dec 2008 12:55:29 +0000 (12:55 +0000)]
VT-d: check return value of pirq_guest_bind()
This eliminates a hypervisor crash when the respective domain dies or
gets the device hot removed.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Reviewed-by: Weidong Han <weidong.han@intel.com>
Keir Fraser [Tue, 9 Dec 2008 12:53:19 +0000 (12:53 +0000)]
tools: Fix a few error-path memory leaks.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Tue, 9 Dec 2008 12:45:45 +0000 (12:45 +0000)]
xend: Remember bootable flag for vbds in xenstore
When xend is restarted, bootable flags of all disk devices are lost
and then the first disk is marked as bootable by a "compatibility
hack". When a guest domain is created with a mixture of several vbd
and tap devices, the compatibility hack may fail to choose the right
bootable device, thus preventing the guest from being restarted. This patch
fixes this behavior by remembering the bootable flag for each disk device
in the xenstore database.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Keir Fraser [Tue, 9 Dec 2008 12:44:32 +0000 (12:44 +0000)]
xend: Fix memory allocation bug after hvm reboot in numa system
Recently we found a bug on a Nehalem machine (two nodes in total, 6G of
memory, 3G in each node):
- Start a HVM guest with all its VCPUs pinned to node1, so all its
memory is allocated from node1.
- Reboot the HVM.
- Some memory will be allocated from node0 even though there is enough
free memory on node1.
Reason: For security reasons, xen will not put all the pages of a dying
hvm on the domheap directly, but puts them on the scrub list to be handled
by page_scrub_softirq(). If the dying hvm has a lot of memory,
page_scrub_softirq() will not have handled all of it before the hvm is
started again. Some pages belonging to node1 are still on the scrub list,
and the new hvm can't use them. So this hvm will get a different memory
distribution than before. Before changeset 18304, page_scrub_softirq()
could be executed in parallel on all the cpus. Changeset 18305
serialised page_scrub_softirq() and changeset 18307 serialised
page_scrub_softirq() with a new lock to avoid holding up acquisition of
page_scrub_lock in free_domheap_pages(). Those changesets slowed down the
handling of pages on the scrub list, so the bug became more obvious afterwards.
Patch: This patch modifies balloon.free to avoid this bug. After the
patch, balloon.free checks whether the current machine is a numa
system and whether the newly created hvm has all its vcpus on the same node.
If all the conditions above hold, we wait until all the pages on the
scrub list are freed (if the waiting time goes beyond 20s, we stop
waiting).
This seems too restrictive at first glance. We used to only
wait until the free memory size of the pinned node was bigger than
required. But as we know, the HVM memory allocation granularity is 2M. Even
if the former condition is satisfied, we still may not find enough
2M-size memory on that node.
Signed-off-by: Ting Zhou <ting.g.zhou@intel.com>
Signed-off-by: Xiaowei Yang <Xiaowei.yang@intel.com>
Keir Fraser [Tue, 9 Dec 2008 12:42:18 +0000 (12:42 +0000)]
libxc: Fix gcc 4.3 build failure
Signed-off-by: Kouya Shimura <kouya@jp.fujitsu.com>
Keir Fraser [Tue, 9 Dec 2008 12:41:12 +0000 (12:41 +0000)]
rombios: support BCV
Signed-off-by: Akio Takebe <takebe_akio@jp.fujitsu.com>
Signed-off-by: Kouya Shimura <kouya@jp.fujitsu.com>
Keir Fraser [Fri, 5 Dec 2008 15:54:22 +0000 (15:54 +0000)]
Fix domain save when guest is in S3.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Fri, 5 Dec 2008 15:24:12 +0000 (15:24 +0000)]
x86: make an error message more precise
... allowing to distinguish whether the to be added or the already
existing PIRQ binding is causing the failure.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Keir Fraser [Fri, 5 Dec 2008 15:23:32 +0000 (15:23 +0000)]
cpufreq: allow customization of some parameters
Short of having a way for powersaved to dynamically adjust these
values, at least allow specifying them on the command line. In
particular, always running at an up-threshold of 80% is perhaps nice
for laptop use, but certainly not desirable on servers. On shell
scripts invoking large numbers of short-lived processes I noticed a
50% performance degradation on a dual-socket quad-core Barcelona just
because of the load of an individual core never crossing the 80%
boundary that would have resulted in increasing the frequency.
(Powersaved on SLE10 sets this on native kernels to 60% or 80%,
depending on whether performance or power reduction is preferred,
*divided* by the number of CPUs, but capped at the lower limit of
20%.)
Signed-off-by: Jan Beulich <jbeulich@novell.com>
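The Powersaved rule quoted in parentheses above works out as in this minimal sketch. The function name is hypothetical and integer division is an assumption; only the 60%/80% base values, the division by the number of CPUs and the 20% floor come from the message.

```c
#include <assert.h>

/*
 * Hypothetical helper: base_percent is 60 or 80 depending on whether
 * performance or power reduction is preferred; the result is divided
 * by the number of CPUs and never drops below 20.
 */
unsigned int powersaved_up_threshold(unsigned int base_percent,
                                     unsigned int ncpus)
{
    unsigned int threshold;

    assert(ncpus > 0);
    threshold = base_percent / ncpus;
    return threshold < 20 ? 20 : threshold;
}
```

On the dual-socket quad-core machine mentioned above (8 CPUs), both base values end up at the 20% floor, far below the fixed 80% the message complains about.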
Keir Fraser [Fri, 5 Dec 2008 15:22:43 +0000 (15:22 +0000)]
x86/cpufreq: reduce verbosity
These messages don't exist in powernow's equivalent code, and are
pretty useless anyway, hence just cluttering the logs.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Keir Fraser [Fri, 5 Dec 2008 15:22:21 +0000 (15:22 +0000)]
powernow: implement struct cpufreq_driver.verify
Without this, under rare conditions hypervisor crashes are possible
due to this method being called without checking against NULL.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Keir Fraser [Fri, 5 Dec 2008 15:21:59 +0000 (15:21 +0000)]
x86/32on64: adjust address when converting syscall to fault
The faulting address is at the start of the syscall instruction rather
than at the following one.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
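The adjustment is simple arithmetic: the syscall instruction is two bytes long (opcode 0F 05), and the saved instruction pointer refers to the following instruction. A minimal sketch, with a hypothetical function name:

```c
#include <assert.h>
#include <stdint.h>

#define SYSCALL_INSN_LEN 2   /* syscall is encoded as the two bytes 0F 05 */

/*
 * Hypothetical helper: given the instruction pointer saved after a
 * syscall executed, return the address to report for the fault, which
 * is the start of the syscall instruction itself.
 */
uint64_t syscall_fault_address(uint64_t ip_after_syscall)
{
    return ip_after_syscall - SYSCALL_INSN_LEN;
}
```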
Keir Fraser [Fri, 5 Dec 2008 14:46:38 +0000 (14:46 +0000)]
x86, time: Fix scale_reciprocal().
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Fri, 5 Dec 2008 13:06:57 +0000 (13:06 +0000)]
minios: Clip memory not usable by Mini-OS (above 1GB)
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Keir Fraser [Fri, 5 Dec 2008 13:03:44 +0000 (13:03 +0000)]
cpuidle: revise tsc-save/restore to reduce tsc skew between cpus
Signed-off-by: Wei Gang <gang.wei@intel.com>
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Fri, 5 Dec 2008 11:37:20 +0000 (11:37 +0000)]
vga: Clear the screen when relinquishing VGA to dom0.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Fri, 5 Dec 2008 11:05:45 +0000 (11:05 +0000)]
xentrace: trace interrupt window
Make a specific interrupt-window trace, with information about why the
interrupt in question can't be delivered.
Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
Keir Fraser [Fri, 5 Dec 2008 10:59:41 +0000 (10:59 +0000)]
VT-d code cleanup
This patch narrows the context-cache flush range from
domain-selective to device-selective when unmapping a device.
Signed-off-by: Yu Zhao <yu.zhao@intel.com>
Isaku Yamahata [Fri, 5 Dec 2008 06:47:19 +0000 (15:47 +0900)]
merge with xen-unstable.hg
Isaku Yamahata [Fri, 5 Dec 2008 06:43:08 +0000 (15:43 +0900)]
IA64: implement PHYSDEVOP_pirq_eoi_gmfn and related stuff.
This patch is the ia64 counterpart of 18844:c820bf73a914.
Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
Isaku Yamahata [Fri, 5 Dec 2008 06:43:06 +0000 (15:43 +0900)]
IA64: eliminate NR_IRQ_VECTORS. ia64 part.
This is the ia64 counterpart of 18802:935bd48f096a, which eliminates
NR_IRQ_VECTORS.
Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
Keir Fraser [Thu, 4 Dec 2008 16:36:43 +0000 (16:36 +0000)]
docs: Add description of BUILD_BUG_ON().
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Thu, 4 Dec 2008 14:12:08 +0000 (14:12 +0000)]
Fix one timer range issue
According to the timer semantics, a timer can be executed at any time
within [expires, expires_end]. However, the current implementation only allows
a timer to be executed after expires_end, which does not conform to the timer
semantics.
This patch fixes the SPECpower score regression (~5% downgrade)
introduced by changeset 18744 "Change timer implementation to allow
variable 'slop'".
Signed-off-by: Yu Ke <ke.yu@intel.com>
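The difference between the two behaviours described above can be sketched with two predicates. The struct and function names are hypothetical and do not reflect the actual timer code; they only restate the [expires, expires_end] window.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical view of a timer's allowed firing window. */
struct timer_window {
    uint64_t expires;       /* earliest time the timer may run */
    uint64_t expires_end;   /* latest time, i.e. expires plus slop */
};

/* Behaviour described as wrong: run only once the window has passed. */
bool due_after_window(const struct timer_window *t, uint64_t now)
{
    return now > t->expires_end;
}

/* Intended semantics: the timer may run from 'expires' onwards. */
bool due_within_window(const struct timer_window *t, uint64_t now)
{
    return now >= t->expires;
}
```

With a window of [100, 150] and now = 120, the first predicate still defers the timer while the second allows it to run, which is where the extra latency comes from.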
Keir Fraser [Thu, 4 Dec 2008 12:35:22 +0000 (12:35 +0000)]
New document on error handling in Xen.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Thu, 4 Dec 2008 11:36:18 +0000 (11:36 +0000)]
Fix existence check for MMIO-mapped 16550 UARTs
Changeset 982e6fce0e47 added an existence test for UARTs.
Unfortunately, the existence test happens before MMIO UARTs are
ioremapped, therefore it may not be probing where it thinks it's
probing. Rather than moving more code around, I think it's probably
safe to assume the arch code knows what it's doing if it passes in an
MMIO UART.
Signed-off-by: Alex Williamson <alex.williamson@hp.com>
Keir Fraser [Thu, 4 Dec 2008 11:32:43 +0000 (11:32 +0000)]
xm: Fix xm block-list for inactive managed domains
Signed-off-by: Masaki Kanno <kanno.masaki@jp.fujitsu.com>
Keir Fraser [Thu, 4 Dec 2008 11:31:37 +0000 (11:31 +0000)]
xend: Remember bootloader settings in xenstore
When xend is restarted, bootloader settings of all running domains are
lost. This patch fixes this by saving bootloader and
bootloader_args to the xenstore database.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Isaku Yamahata [Thu, 4 Dec 2008 02:01:53 +0000 (11:01 +0900)]
merge with xen-unstable.hg
Keir Fraser [Wed, 3 Dec 2008 15:58:23 +0000 (15:58 +0000)]
xentop: Fix xentop for blktap
Blktap devices information isn't shown by xentop currently.
xen-unstable c/s 17813 said "blktap devices have statistics
counters (e.g., rd_req, wr_req, oo_req) prepended by tap_".
In fact, it is as follows.
# ls -l /sys/devices/xen-backend/tap-1-769/statistics/
total 0
-r--r--r-- 1 root root 4096 Dec 3 20:37 oo_req
-r--r--r-- 1 root root 4096 Dec 3 20:37 rd_req
-r--r--r-- 1 root root 4096 Dec 3 20:37 rd_sect
-r--r--r-- 1 root root 4096 Dec 3 20:37 wr_req
-r--r--r-- 1 root root 4096 Dec 3 20:37 wr_sect
The statistics counters lack the "tap_" prefix because it was removed
by linux-2.6.18-xen c/s 34.
This patch reverts xen-unstable c/s 17813, so that xentop can show
blktap device information.
Signed-off-by: Masaki Kanno <kanno.masaki@jp.fujitsu.com>
Keir Fraser [Wed, 3 Dec 2008 15:56:33 +0000 (15:56 +0000)]
AMD IOMMU: Invalidate all pages on domain destruction
This patch adds support for invalidating all pages associated with
the same domain ID on domain destruction.
Signed-off-by: Wei Wang <wei.wang2@amd.com>