xen.git
16 years agox86: Poison initmem at end of Xen bootstrap
Keir Fraser [Mon, 5 Jan 2009 11:55:24 +0000 (11:55 +0000)]
x86: Poison initmem at end of Xen bootstrap

Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
16 years agovmx: Print advanced features during boot
Keir Fraser [Mon, 5 Jan 2009 11:52:34 +0000 (11:52 +0000)]
vmx: Print advanced features during boot
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
16 years agoDownload external tarballs from xenbits.xensource.com
Keir Fraser [Mon, 5 Jan 2009 11:19:16 +0000 (11:19 +0000)]
Download external tarballs from xenbits.xensource.com

I have copied the tarballs that the xen-unstable build downloads to
xenbits.xensource.com (which also hosts our hg and git).  This patch
changes the download URLs to use that location.

That way the build will depend on only one external machine, under one
administration, rather than many.  Also it means that the build won't
break if these sites become permanently unavailable or are rearranged
and we don't run a risk of having to panic and beg if a file should go
missing.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
16 years agoCpufreq: simplify cpufreq_statistic_lock init
Keir Fraser [Mon, 5 Jan 2009 11:16:41 +0000 (11:16 +0000)]
Cpufreq: simplify cpufreq_statistic_lock init

Signed-off-by: Liu, Jinsong <jinsong.liu@intel.com>
16 years agoCpufreq: prevent negative px resident time, add spinlock to avoid race
Keir Fraser [Mon, 5 Jan 2009 11:16:12 +0000 (11:16 +0000)]
Cpufreq: prevent negative px resident time, add spinlock to avoid race

Because the NOW() value may drift between different cpus, add protection
to prevent a negative px resident time.
Because both the cpufreq logic and xenpm may race when accessing
cpufreq_statistic_data, add a spinlock to avoid the race.
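
A minimal sketch of the negative-residency protection described above,
not the code from this patch. The struct and function names are
hypothetical, and the spinlock is left out so the example stands alone.

```c
#include <stdint.h>

/* Hypothetical per-P-state accounting record; not the Xen structure. */
struct px_stat {
    uint64_t residency_ns;
};

/*
 * Add the time spent in a P-state. 'entered' and 'now' may have been
 * sampled on different CPUs whose clocks are not perfectly in step, so
 * the difference can be negative; in that case nothing is added.
 */
static void px_account(struct px_stat *st, int64_t entered, int64_t now)
{
    int64_t delta = now - entered;

    if (delta < 0)
        delta = 0;
    st->residency_ns += (uint64_t)delta;
}
```

In a concurrent setting the update would also have to run under the
lock that protects the statistics.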

Signed-off-by: Liu, Jinsong <jinsong.liu@intel.com>
16 years agoCpufreq: remove redundant fragments
Keir Fraser [Mon, 5 Jan 2009 11:15:40 +0000 (11:15 +0000)]
Cpufreq: remove redundant fragments

Signed-off-by: Liu, Jinsong <jinsong.liu@intel.com>
16 years agovtd hotplug: check if a device can be hot-plugged.
Keir Fraser [Mon, 5 Jan 2009 11:14:18 +0000 (11:14 +0000)]
vtd hotplug: check if a device can be hot-plugged.

When we statically assign a PCI device (the pci=['xx:xx.x'] string
in the guest config file) to a guest, we perform many checks (for
instance, whether the device is specified in 'pciback.hide', whether it
has non-page-aligned MMIO BARs, whether it has a proper FLR capability,
and whether related devices should be co-assigned). For guest hotplug,
however, we only check that the device exists and is not yet assigned.
This is not enough: for instance, xend currently allows us to assign an
in-use device (being used by Dom0) to an HVM guest (because
xc.test_assigned() returns OK), which will cause disaster. The
patch adds the necessary checks.

Signed-off-by: Dexuan Cui <dexuan.cui@intel.com>
16 years agovtd: avoid redundant context mapping
Keir Fraser [Mon, 5 Jan 2009 11:13:22 +0000 (11:13 +0000)]
vtd: avoid redundant context mapping

After changeset 18934 (VT-d: Fix PCI-X device assignment), my assigned
PCI E1000 NIC doesn't work in guest.

The NIC is 03:00.0. Its parent bridge is: 00:1e.0.
In domain_context_mapping():
   case DEV_TYPE_PCI:
   After we domain_context_mapping_one() 03:00.0 and 00:1e.0, the
   'secbus' is 3 and 'bus' is 0,  so we domain_context_mapping_one()
   03:00.0 again -- this redundant invocation returns -EINVAL because
   we have created the mapping but haven't changed pdev->domain from
   Dom0 to a new domain at this time and eventually the
   XEN_DOMCTL_assign_device hypercall returns a failure.

The attached patch detects this case and avoids the redundant
invocation.

Signed-off-by: Dexuan Cui <dexuan.cui@intel.com>
16 years agoxenctx: compat-mode/HVM support
Keir Fraser [Mon, 5 Jan 2009 11:08:53 +0000 (11:08 +0000)]
xenctx: compat-mode/HVM support

Add support to xenctx for guests that have a different word size
to the tools (x86 only, but shouldn't break ia64).  Again, only 32-bit
HVM guests are supported until EFER.LMA is easier to get at.

Signed-off-by: Tim Deegan <Tim.Deegan@citrix.com>
16 years agoMake xc_translate_foreign_address aware of compat-mode guests and
Keir Fraser [Mon, 5 Jan 2009 11:08:25 +0000 (11:08 +0000)]
Make xc_translate_foreign_address aware of compat-mode guests and
(32-bit) HVM guests.  64-bit HVM guests are still not supported for
now, pending a sensible way of getting at the guest's EFER.LMA.

Signed-off-by: Tim Deegan <Tim.Deegan@citrix.com>
16 years agop2m: Small audit fixes.
Keir Fraser [Mon, 5 Jan 2009 10:47:51 +0000 (10:47 +0000)]
p2m: Small audit fixes.

Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
16 years agoPoD memory 9/9: xend integration
Keir Fraser [Mon, 5 Jan 2009 10:47:03 +0000 (10:47 +0000)]
PoD memory 9/9: xend integration

Xend integration for PoD functionality.
* Add python bindings for xc_hvm_domain_build() and
xc_domain_memory_set_pod_target()
* Always call xc_hvm_domain_build(), with memsize = memory_static_max
and target=memory_dynamic_max
* When setting a new memory target:
 + First make sure we actually have enough free memory for the target
setting to succeed
 + Call set_pod_target() with the new target, so Xen can do the Right
 Thing.

Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
16 years agoPoD memory 8/9: libxc interface
Keir Fraser [Mon, 5 Jan 2009 10:46:37 +0000 (10:46 +0000)]
PoD memory 8/9: libxc interface

Implement libxc interface to PoD functionality:
* Add xc_hvm_build_target_mem(), which takes both memsize and target.
Memsize is the total memory, allocated in PoD pages and reported in
the e820; target is the size of the cache.  If these are the same, the
normal functionality is called.  (So you can use the same function to
build always, and it will decide whether to use PoD or not.)
* Add xc_domain_memory_[gs]et_pod_target(), which sets and/or returns
information about the PoD cache and p2m entries.

Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
16 years agoPoD memory 7/9: Xen interface
Keir Fraser [Mon, 5 Jan 2009 10:45:48 +0000 (10:45 +0000)]
PoD memory 7/9: Xen interface

Implement Xen interface to PoD functionality.
* Increase the number of MEMOP bits from 4 to 6 (increasing the number
of available memory operations from 16 to 64).
* Introduce XENMEMF_populate_on_demand, which will cause
populate_physmap() to fill a range with PoD entries rather than
backing it with ram
* Introduce XENMEM_[sg]et_pod_target operation to the memory
hypercall, to get and set PoD cache size.  set_pod_target() should be
called during domain creation, as well as after modifying the memory
target of any domain which may have outstanding PoD entries.
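
The arithmetic behind the first bullet can be sketched as follows. This
is only an illustration: it assumes the operation number occupies the
low-order bits of the command word, and the macro and function names are
made up for the example.

```c
/* Assumed widths for illustration; the names are not the Xen macros. */
#define CMD_BITS_OLD 4u
#define CMD_BITS_NEW 6u

/* Number of distinct operations encodable in 'bits' low-order bits. */
static unsigned int memop_count(unsigned int bits)
{
    return 1u << bits;
}

/* Extract the operation number from a command word. */
static unsigned int memop_op(unsigned long cmd, unsigned int bits)
{
    return (unsigned int)(cmd & ((1ul << bits) - 1));
}
```

With 4 bits there are 16 operation numbers; with 6 bits there are 64.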

Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
16 years agoPoD memory 6/9: superpage splintering
Keir Fraser [Mon, 5 Jan 2009 10:45:09 +0000 (10:45 +0000)]
PoD memory 6/9: superpage splintering

Deal with splintering superpages in the PoD cache, and with
splintering superpage PoD entries in the p2m table.

Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
16 years agoPoD memory 5/9: emergency scan
Keir Fraser [Mon, 5 Jan 2009 10:44:39 +0000 (10:44 +0000)]
PoD memory 5/9: emergency scan

Implement "emergency scan" for zero pages, to deal with start-of-day
page scrubbers.

If the cache is running out, scan through memory looking for "zero
pages" that we can reclaim for the cache.  This is necessary for
operating systems which have a start-of-day page scrubber which runs
before the balloon driver can balloon down to the target.
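
The core test in such a scan is whether a page contains only zeroes. A
minimal sketch, assuming 4 KiB pages and a suitably aligned buffer; the
function name is hypothetical and this is not the code from the patch:

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096u

/* Return 1 if every byte of the page is zero, 0 otherwise. */
static int page_is_zero(const void *page)
{
    const uint64_t *w = page;   /* compare eight bytes at a time */
    size_t i;

    for (i = 0; i < PAGE_SIZE / sizeof(*w); i++)
        if (w[i] != 0)
            return 0;
    return 1;
}
```

A page that passes this test could be reclaimed for the cache, since it
can be repopulated with a zeroed page on demand.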

Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
16 years agoPoD memory 4/9: Decrease reservation
Keir Fraser [Mon, 5 Jan 2009 10:43:50 +0000 (10:43 +0000)]
PoD memory 4/9: Decrease reservation

Handle balloon driver's calls to decrease_reservation properly.
* Replace PoD entries with p2m_none
* Steal memory for the cache instead of freeing, if need be

Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
16 years agoPoD memory 3/9: PoD core
Keir Fraser [Mon, 5 Jan 2009 10:43:19 +0000 (10:43 +0000)]
PoD memory 3/9: PoD core

Core of populate-on-demand functionality:
* Introduce a populate-on-demand type
* Call p2m_demand_populate() when gfn_to_mfn() encounters PoD entries
* Return p2m memory to the domain list for freeing during domain destruction
* Audit p2m checks our PoD-entry reference-counting
* Add PoD information to the 'q' debug key

Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
16 years agoPoD memory 2/9: calls to gfn_to_mfn_query()
Keir Fraser [Mon, 5 Jan 2009 10:42:39 +0000 (10:42 +0000)]
PoD memory 2/9: calls to gfn_to_mfn_query()

Shadow code, and other important places, call gfn_to_mfn_query().  In
particular, any place that holds the shadow lock must make a query
call.

Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
16 years agoPoD (populate-on-demand) memory 1/9: Add a p2m query type.
Keir Fraser [Mon, 5 Jan 2009 10:41:48 +0000 (10:41 +0000)]
PoD (populate-on-demand) memory 1/9: Add a p2m query type.

Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
16 years agoacpi: Reserve IO ports used by hotplug
Keir Fraser [Mon, 29 Dec 2008 14:13:07 +0000 (14:13 +0000)]
acpi: Reserve IO ports used by hotplug

In DSDT, reserve the IO port ranges: [0x10c0, 0x10c2] and [0xb044,
0xb047] that are used by the virtual PCI hotplug.
Otherwise, for a hotplugged-in device, the port IO BAR assigned by the
guest OS may conflict with these ranges.
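
To illustrate the conflict being avoided, here is a small sketch that
checks a port IO BAR against the two ranges above. The struct and
function names are hypothetical; the actual reservation is done in the
DSDT, not in C.

```c
#include <stddef.h>

/* Inclusive IO port range. */
struct port_range {
    unsigned int first;
    unsigned int last;
};

/* The two ranges used by the virtual PCI hotplug, from the text above. */
static const struct port_range reserved[] = {
    { 0x10c0, 0x10c2 },
    { 0xb044, 0xb047 },
};

/* Return 1 if [base, base + size - 1] overlaps a reserved range (size > 0). */
static int bar_conflicts(unsigned int base, unsigned int size)
{
    unsigned int last = base + size - 1;
    size_t i;

    for (i = 0; i < sizeof(reserved) / sizeof(reserved[0]); i++)
        if (base <= reserved[i].last && reserved[i].first <= last)
            return 1;
    return 0;
}
```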

Signed-off-by: Dexuan Cui <dexuan.cui@intel.com>
16 years agocpufreq: xen is default cpufreq, userspace is default governor (override on cmdline)
Keir Fraser [Mon, 29 Dec 2008 14:08:46 +0000 (14:08 +0000)]
cpufreq: xen is default cpufreq, userspace is default governor (override on cmdline)

Make the userspace governor the default, which has the same effect
as when cpufreq in xen is not enabled. This lets us enable cpufreq
in xen by default, avoiding a reboot to activate cpufreq. It is now
always on, but without performance impact if the user doesn't attempt
to change the governor.

Add a governor option on the cmdline and some warning output for debugging.

Signed-off-by: Liu, Jinsong <jinsong.liu@intel.com>
16 years agox86, vmx: Fix single step on debugger
Keir Fraser [Mon, 29 Dec 2008 14:05:26 +0000 (14:05 +0000)]
x86, vmx: Fix single step on debugger

Signed-off-by: Kouya Shimura <kouya@jp.fujitsu.com>
16 years agox86, mce: Fix x86_mcinfo_getptr is called when no error found
Keir Fraser [Mon, 29 Dec 2008 14:03:26 +0000 (14:03 +0000)]
x86, mce: Fix x86_mcinfo_getptr is called when no error found

machine_check_poll() is called every time with mi, which is set by
x86_mcinfo_getptr(). But I think x86_mcinfo_getptr() should not be
called when there is no error, because error_idx and fetch_idx cannot
work together.

Signed-off-by: Kazuhiro Suzuki <kaz@jp.fujitsu.com>
16 years agorombios: disable DEBUG_ROMBIOS by default.
Keir Fraser [Mon, 29 Dec 2008 14:00:45 +0000 (14:00 +0000)]
rombios: disable DEBUG_ROMBIOS by default.

Signed-off-by: Akio Takebe <takebe_akio@jp.fujitsu.com>
16 years agorombios: pass BDF correctly during option ROM scan
Keir Fraser [Mon, 29 Dec 2008 14:00:15 +0000 (14:00 +0000)]
rombios: pass BDF correctly during option ROM scan

Signed-off-by: Akio Takebe <takebe_akio@jp.fujitsu.com>
Signed-off-by: Kouya Shimura <kouya@jp.fujitsu.com>
16 years agorombios: enabling option ROM write access during initialisation.
Keir Fraser [Mon, 29 Dec 2008 13:55:59 +0000 (13:55 +0000)]
rombios: enabling option ROM write access during initialisation.

Signed-off-by: Akio Takebe <takebe_akio@jp.fujitsu.com>
16 years agocpufreq: Fix a cpufreq cmdline parse bug, and change sample_rate unit
Keir Fraser [Mon, 29 Dec 2008 13:37:46 +0000 (13:37 +0000)]
cpufreq: Fix a cpufreq cmdline parse bug, and change sample_rate unit

Signed-off-by: Liu Jinsong <jinsong.liu@intel.com>
16 years agox86: Do not restrict 32-bit EPT to 4GB.
Keir Fraser [Mon, 29 Dec 2008 13:32:32 +0000 (13:32 +0000)]
x86: Do not restrict 32-bit EPT to 4GB.

Signed-off-by: Xin, Xiaohui <xiaohui.xin@intel.com>
16 years agox86, intel: Clear Error counter field when set new cmci owner
Keir Fraser [Mon, 29 Dec 2008 13:30:14 +0000 (13:30 +0000)]
x86, intel: Clear Error counter field when set new cmci owner

A CMCI might happen while a cpu is being taken down (cpu hotplug), before
a new cmci owner is set and while the old owner is down. We need to clear
the corrected error counter field to make sure CMCI can be triggered on
the new owner.

Signed-off-by: Yunhong Jiang <yunhong.jiang@intel.com>
Signed-off-by: Liping Ke <liping.ke@intel.com>
17 years agoi386: Fix the build.
Keir Fraser [Mon, 22 Dec 2008 13:48:40 +0000 (13:48 +0000)]
i386: Fix the build.

Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
17 years agoshadow: Remove warnings about writes to read-only BIOS area. These
Keir Fraser [Mon, 22 Dec 2008 13:43:13 +0000 (13:43 +0000)]
shadow: Remove warnings about writes to read-only BIOS area. These
attempts can be legitimate.

Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
17 years agoCleanup Intel CMCI support.
Keir Fraser [Mon, 22 Dec 2008 12:07:20 +0000 (12:07 +0000)]
Cleanup Intel CMCI support.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
17 years agoEnable CMCI for Intel CPUs
Keir Fraser [Mon, 22 Dec 2008 08:12:33 +0000 (08:12 +0000)]
Enable CMCI for Intel CPUs

Signed-off-by: Yunhong Jiang <yunhong.jiang@intel.com>
Signed-off-by: Liping Ke <liping.ke@intel.com>

17 years agoSupport S3 for MSI interrupt
Keir Fraser [Fri, 19 Dec 2008 14:56:36 +0000 (14:56 +0000)]
Support S3 for MSI interrupt

From: "Jiang, Yunhong" <yunhong.jiang@intel.com>
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
17 years agoChange the pcidevs_lock from rw_lock to spin_lock
Keir Fraser [Fri, 19 Dec 2008 14:52:32 +0000 (14:52 +0000)]
Change the pcidevs_lock from rw_lock to spin_lock

As pcidevs_lock has changed from protecting only the alldevs_list to
protecting more than that, it doesn't benefit much from being a rw_lock.
Also, the previous patch 18906:2941b1a97c60 was wrong to use read_lock to
protect some sensitive data (thanks to Espen for pointing that out).

This patch also contains two minor fixes:
a) deassign_device will deadlock when it tries to take the pcidevs_lock if
called by pci_release_devices, so move the locking to the caller.
b) iommu_domain_teardown should not ASSERT on the pcidevs_lock
because it just updates the domain's vt-d mapping.

Signed-off-by: Yunhong Jiang <yunhong.jiang@intel.com>
17 years agoCPUIDLE: adjust cstate statistic interface
Keir Fraser [Fri, 19 Dec 2008 14:44:40 +0000 (14:44 +0000)]
CPUIDLE: adjust cstate statistic interface

1. change unit of residency, PM ticks -> ns.
2. output C0 usage & residency.
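
A sketch of the unit change in item 1, assuming the standard ACPI PM
timer frequency of 3.579545 MHz. The function name is hypothetical and
this is not the conversion code from the patch.

```c
#include <stdint.h>

/* The ACPI PM timer runs at 3.579545 MHz. */
#define PM_TIMER_HZ 3579545ull

/*
 * Convert PM timer ticks to nanoseconds. Whole seconds and the
 * remainder are converted separately so the multiplication cannot
 * overflow for large tick counts.
 */
static uint64_t pm_ticks_to_ns(uint64_t ticks)
{
    uint64_t secs = ticks / PM_TIMER_HZ;
    uint64_t rem = ticks % PM_TIMER_HZ;

    return secs * 1000000000ull + rem * 1000000000ull / PM_TIMER_HZ;
}
```

One tick is roughly 279 ns, so residency figures in ns are about 279
times larger than the same figures in PM ticks.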

Signed-off-by: Wei Gang <gang.wei@intel.com>
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
17 years agoVT-d: Fix PCI-X device assignment
Keir Fraser [Fri, 19 Dec 2008 13:42:04 +0000 (13:42 +0000)]
VT-d: Fix PCI-X device assignment

When assigning a PCI device, the current code just maps its bridge, and
its secondary bus number with devfn 0. This doesn't work for PCI-X device
assignment, because the request may carry either the source-id of the
original PCI-X transaction or the source-id provided by the bridge. We
need to map the device itself, and its upstream bridges up to the
PCIe-to-PCI/PCI-X bridge.

In addition, add description for DEV_TYPE_PCIe_BRIDGE and
DEV_TYPE_PCI_BRIDGE for understandability.

Signed-off-by: Weidong Han <weidong.han@intel.com>
17 years agoxend: Actually restrict a domU's access to xenstore when we mean to --
Keir Fraser [Thu, 18 Dec 2008 17:18:28 +0000 (17:18 +0000)]
xend: Actually restrict a domU's access to xenstore when we mean to --
this means that in some cases it cannot be owner of its own xenstore
nodes.

This bug was pointed out by Daniel Berrange at Red Hat. This patch is
my own more generic fix that automatically covers a range of callers
(albeit the patch is arguably a bit of a hack ;-).

Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
17 years agox86: Quieten tracing in msi startup/shutdown handlers.
Keir Fraser [Thu, 18 Dec 2008 17:14:27 +0000 (17:14 +0000)]
x86: Quieten tracing in msi startup/shutdown handlers.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
17 years agorombios: Update to Bochs latest
Keir Fraser [Thu, 18 Dec 2008 14:52:53 +0000 (14:52 +0000)]
rombios: Update to Bochs latest

Signed-off-by: Akio Takebe <takebe_akio@jp.fujitsu.com>
17 years agoxenoprof: Add support for Intel Dunnington cores.
Keir Fraser [Thu, 18 Dec 2008 11:29:33 +0000 (11:29 +0000)]
xenoprof: Add support for Intel Dunnington cores.

Signed-off-by: Xiaowei Yang <Xiaowei.yang@intel.com>
Signed-off-by: Ting Zhou <ting.g.zhou@intel.com>
17 years agox86, shadow: Avoid duplicates in fixup tables.
Keir Fraser [Thu, 18 Dec 2008 11:28:25 +0000 (11:28 +0000)]
x86, shadow: Avoid duplicates in fixup tables.

Avoid entering duplicates in fixup tables, reducing fixup evictions.

Signed-off-by: Gianluca Guida <gianluca.guida@eu.citrix.com>
17 years agoFix mini-os ia64 compilation
Keir Fraser [Thu, 18 Dec 2008 11:27:37 +0000 (11:27 +0000)]
Fix mini-os ia64 compilation

- Avoid nested function to avoid a trampoline.
- Do not link mini-os_app.o when it is empty.

Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
17 years agox86, hvm: Don't ever call the shadow code to fix a page fault in an
Keir Fraser [Wed, 17 Dec 2008 11:36:22 +0000 (11:36 +0000)]
x86, hvm: Don't ever call the shadow code to fix a page fault in an
external-mode guest if the fault came from Xen; it would be making
changes to the wrong pagetables, potentially causing a pagefault loop
in Xen.

Signed-off-by: Tim Deegan <Tim.Deegan@citrix.com>
17 years agoxenpm: add cpu frequency control interface, through which user can
Keir Fraser [Tue, 16 Dec 2008 13:14:25 +0000 (13:14 +0000)]
xenpm: add cpu frequency control interface, through which user can
tune the parameters manually.

Now, xenpm can be invoked with the following options:
Usage:
       xenpm get-cpuidle-states [cpuid]: list cpu idle information on
       CPU cpuid or all CPUs.
       xenpm get-cpufreq-states [cpuid]: list cpu frequency
       information on CPU cpuid or all CPUs.
       xenpm get-cpufreq-para [cpuid]: list cpu frequency information
       on CPU cpuid or all CPUs.
       xenpm set-scaling-maxfreq <cpuid> <HZ>: set max cpu frequency
       <HZ> on CPU <cpuid>.
       xenpm set-scaling-minfreq <cpuid> <HZ>: set min cpu frequency
       <HZ> on CPU <cpuid>.
       xenpm set-scaling-governor <cpuid> <name>: set scaling governor
       on CPU <cpuid>.
       xenpm set-scaling-speed <cpuid> <num>: set scaling speed on CPU
       <cpuid>.
       xenpm set-sampling-rate <cpuid> <num>: set sampling rate on CPU
       <cpuid>.
       xenpm set-up-threshold <cpuid> <num>: set up threshold on CPU
       <cpuid>.

To ease the use of this tool, the shortcut option is supported,
i.e. `xenpm get-cpui' is equal to `xenpm get-cpuidle-states'.
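
One way to implement such shortcuts is unique-prefix matching over the
command table. The sketch below is an assumption about the approach, not
the xenpm source; in particular, rejecting ambiguous prefixes is a
choice made for the example.

```c
#include <stddef.h>
#include <string.h>

/* Command names taken from the usage text above. */
static const char *const commands[] = {
    "get-cpuidle-states",
    "get-cpufreq-states",
    "get-cpufreq-para",
    "set-scaling-maxfreq",
    "set-scaling-minfreq",
    "set-scaling-governor",
    "set-scaling-speed",
    "set-sampling-rate",
    "set-up-threshold",
};

/*
 * Return the index of the command that 'arg' abbreviates, or -1 if no
 * command starts with 'arg' or more than one does. An exact match
 * always wins.
 */
static int match_command(const char *arg)
{
    size_t len = strlen(arg);
    size_t i;
    int found = -1;
    int ambiguous = 0;

    for (i = 0; i < sizeof(commands) / sizeof(commands[0]); i++) {
        if (strncmp(commands[i], arg, len) != 0)
            continue;
        if (strlen(commands[i]) == len)
            return (int)i;          /* exact match */
        if (found >= 0)
            ambiguous = 1;          /* second prefix match */
        else
            found = (int)i;
    }
    return ambiguous ? -1 : found;
}
```

Here `get-cpui` resolves to `get-cpuidle-states`, while `get-cpu` is
rejected because it also matches the cpufreq commands.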

Signed-off-by: Guanqun Lu <guanqun.lu@intel.com>
17 years agox86: Update xen-detect utility to scan for Xen signature in CPUID space.
Keir Fraser [Tue, 16 Dec 2008 12:04:13 +0000 (12:04 +0000)]
x86: Update xen-detect utility to scan for Xen signature in CPUID space.

Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
17 years agomini-os: Make utility function get_self_id() in fs-front.c public.
Keir Fraser [Tue, 16 Dec 2008 12:00:25 +0000 (12:00 +0000)]
mini-os: Make utility function get_self_id() in fs-front.c public.

Signed-off-by: Yosuke Iwamatsu <y-iwamatsu@ab.jp.nec.com>
17 years agox86: Simpler time handling when TSC is constant across all power saving states.
Keir Fraser [Tue, 16 Dec 2008 11:59:22 +0000 (11:59 +0000)]
x86: Simpler time handling when TSC is constant across all power saving states.

Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Signed-off-by: Gang Wei <gang.wei@intel.com>
17 years agovmx: Do not disable real EFER.NXE even when disabled by guest.
Keir Fraser [Tue, 16 Dec 2008 11:54:11 +0000 (11:54 +0000)]
vmx: Do not disable real EFER.NXE even when disabled by guest.

We must not disable EFER.NXE in host mode since shadow code relies on
accessing shadow mappings with NX set.

We do not want to write EFER on every vmentry/vmexit if we can avoid
it, since it will be somewhat slow.

Finally, we don't believe that any guest relies on NX really being
disabled when EFER.NXE is cleared.

This given, it makes sense to ignore the guest's setting of EFER.NXE.

Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
17 years agox86: Enable MTF for HVM guest single step in gdb
Keir Fraser [Tue, 16 Dec 2008 11:49:20 +0000 (11:49 +0000)]
x86: Enable MTF for HVM guest single step in gdb

Signed-off-by: Edwin Zhai <edwin.zhai@intel.com>
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
17 years agox86: Decode CPUID for TSC guarantees.
Keir Fraser [Mon, 15 Dec 2008 11:37:14 +0000 (11:37 +0000)]
x86: Decode CPUID for TSC guarantees.

Signed-off-by: Wei Gang <gang.wei@intel.com>
17 years agorombios: fix references to EBDA
Keir Fraser [Mon, 15 Dec 2008 11:23:22 +0000 (11:23 +0000)]
rombios: fix references to EBDA

Extended Bios Data Area (EBDA) can be relocated by the initialization
of PCI option ROM. The IPL boot table can be relocated as well.
EBDA must be accessed via 0x40E after the initialization.

Signed-off-by: Kouya Shimura <kouya@jp.fujitsu.com>
17 years agox86: Small cleanups to time handling.
Keir Fraser [Mon, 15 Dec 2008 11:17:14 +0000 (11:17 +0000)]
x86: Small cleanups to time handling.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
17 years agoxenpmd: Fix bogus fgets() size parameter.
Keir Fraser [Sat, 13 Dec 2008 17:44:20 +0000 (17:44 +0000)]
xenpmd: Fix bogus fgets() size parameter.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
17 years agox86: Clean up and simplify rwlock implementation.
Keir Fraser [Sat, 13 Dec 2008 15:56:16 +0000 (15:56 +0000)]
x86: Clean up and simplify rwlock implementation.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
17 years agoClean up use of spin_is_locked() and introduce rw_is_locked().
Keir Fraser [Sat, 13 Dec 2008 15:28:10 +0000 (15:28 +0000)]
Clean up use of spin_is_locked() and introduce rw_is_locked().

Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
17 years agoxc_pm: Fix off-by-one error in string array access.
Keir Fraser [Sat, 13 Dec 2008 15:04:53 +0000 (15:04 +0000)]
xc_pm: Fix off-by-one error in string array access.

Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
17 years agox86: Fix early time initialisation after recent changes.
Keir Fraser [Sat, 13 Dec 2008 15:02:55 +0000 (15:02 +0000)]
x86: Fix early time initialisation after recent changes.

Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
17 years agoxentop: Fix fprintf() build failure.
Keir Fraser [Thu, 11 Dec 2008 22:32:20 +0000 (22:32 +0000)]
xentop: Fix fprintf() build failure.

Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
17 years agohvmloader: enable bus mastering of PCI device
Keir Fraser [Thu, 11 Dec 2008 13:26:02 +0000 (13:26 +0000)]
hvmloader: enable bus mastering of PCI device

Without this, init routine in some PCI option ROM doesn't work well.

Signed-off-by: Kouya Shimura <kouya@jp.fujitsu.com>
17 years agox86: enable interrupts explicitly in __start_xen()
Keir Fraser [Thu, 11 Dec 2008 13:25:28 +0000 (13:25 +0000)]
x86: enable interrupts explicitly in __start_xen()

Instead of relying on smp_prepare_cpus() (via check_nmi_watchdog()) or
init_xen_time() (via init_platform_timer() -> plt_overflow())
implicitly enabling interrupts, enable them explicitly once safe to do
so (it may actually be possible to move this even further up, but I
don't think that would buy us much).

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Also move spin_debug_enable() a bit higher. Moving it above
smp_prepare_cpus() didn't work for some reason though!

Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
17 years agox86: Clean up early time setup.
Keir Fraser [Thu, 11 Dec 2008 13:10:19 +0000 (13:10 +0000)]
x86: Clean up early time setup.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
17 years agovga: Only vga_endboot() if vga_init() completed.
Keir Fraser [Thu, 11 Dec 2008 13:09:59 +0000 (13:09 +0000)]
vga: Only vga_endboot() if vga_init() completed.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
17 years agox86, cpufreq: Change cpufreq_driver->get so that it can get other
Keir Fraser [Thu, 11 Dec 2008 11:49:37 +0000 (11:49 +0000)]
x86, cpufreq: Change cpufreq_driver->get so that it can get other
cpu's real physical freq.

Signed-off-by: Liu, Jinsong <jinsong.liu@intel.com>
17 years agoRe-enable MSI support
Keir Fraser [Thu, 11 Dec 2008 11:48:19 +0000 (11:48 +0000)]
Re-enable MSI support

Currently the MSI is disabled because of some lock issue. This patch
tries to clean up the locking related to MSI lock.

Signed-off-by: Jiang Yunhong <yunhong.jiang@intel.com>
17 years agox86: fix the potential of encountering panic "IO-APIC + timer doesn't work! ..."
Keir Fraser [Thu, 11 Dec 2008 11:40:10 +0000 (11:40 +0000)]
x86: fix the potential of encountering panic "IO-APIC + timer doesn't work! ..."

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Linux commit:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=4aae07025265151e3f7041dfbf0f529e122de1d8

x86: fix "Kernel panic - not syncing: IO-APIC + timer doesn't work!"

Under rare circumstances we found we could have an IRQ0 entry while we
are in the middle of setting up the local APIC, the i8259A and the
PIT. That is certainly not how it's supposed to work! check_timer()
was supposed to be called with irqs turned off - but this eroded away
sometime in the past. This code would still work most of the time
because this code runs very quickly, but if just the right timing
conditions are present and IRQ0 hits in this small, ~30 usecs window,
timer irqs stop and the system does not boot up. Also, given how early
this is during bootup, the hang is very deterministic - but it would
only occur on certain machines (and certain configs).

The fix was quite simple: disable/restore interrupts properly in this
function. With that in place the test-system now boots up just fine.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
17 years agox86: unify local_irq_XXX()
Keir Fraser [Thu, 11 Dec 2008 11:36:00 +0000 (11:36 +0000)]
x86: unify local_irq_XXX()

This also removes an inconsistency in that x86-64's __save_flags() had
a memory clobber, while x86_32's didn't.

It further adds type checking since blindly using {pop,push}{l,q} on a
memory operand of unknown size bears the risk of corrupting other
data.

Finally, it eliminates the redundant (with local_irq_restore())
__restore_flags() macro and renames __save_flags() to
local_save_flags(), making the naming consistent with Linux (again?).

Signed-off-by: Jan Beulich <jbeulich@novell.com>
17 years agorombios: fix rom_scan (ja->jmp)
Keir Fraser [Thu, 11 Dec 2008 11:32:39 +0000 (11:32 +0000)]
rombios: fix rom_scan (ja->jmp)

Signed-off-by: Akio Takebe <takebe_akio@jp.fujitsu.com>
Signed-off-by: Kouya Shimura <kouya@jp.fujitsu.com>
17 years agoFix a typo caused by 18898.
Keir Fraser [Thu, 11 Dec 2008 11:30:13 +0000 (11:30 +0000)]
Fix a typo caused by 18898.

new state is updated too early.

Signed-off-by: Kevin Tian <kevin.tian@intel.com>
17 years agocpufreq: Short path avoiding IPI in critical fast path.
Keir Fraser [Thu, 11 Dec 2008 11:27:49 +0000 (11:27 +0000)]
cpufreq: Short path avoiding IPI in critical fast path.
Signed-off-by: Liu, Jinsong <jinsong.liu@intel.com>
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
17 years agolibxc: Fix xc_pm.c build by avoiding bogus header includes.
Keir Fraser [Thu, 11 Dec 2008 11:19:27 +0000 (11:19 +0000)]
libxc: Fix xc_pm.c build by avoiding bogus header includes.

Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
17 years agoFix BUILD_BUG_ON()
Keir Fraser [Thu, 11 Dec 2008 11:19:01 +0000 (11:19 +0000)]
Fix BUILD_BUG_ON()

As was noticed on the Linux side, using an array here isn't appropriate
if the condition is not a compile time constant - gcc allows such
arrays, and hence the intended effect of producing a compiler error is
not achieved in that case. Bit field widths do not know similar
language extensions, and hence always produce a compiler error.
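
The bit-field technique can be sketched as below. This is the form used
on the Linux side at the time, shown as an illustration and not copied
from this patch; it relies on gcc accepting a struct with only an
unnamed bit-field.

```c
/*
 * An array-based form along the lines of
 *     (void)sizeof(char[1 - 2 * !!(cond)])
 * silently becomes a variable-length array when 'cond' is not a
 * compile-time constant, so gcc accepts it and nothing is checked.
 *
 * A bit-field width must be an integer constant expression: a true
 * condition gives width -1 (error), a false one gives width 0 (fine),
 * and a non-constant condition is rejected outright.
 */
#define BUILD_BUG_ON(cond) ((void)sizeof(struct { int:-!!(cond); }))

static int check_layout(void)
{
    /* False at compile time, so this expands to a harmless sizeof. */
    BUILD_BUG_ON(sizeof(int) > sizeof(long));

    /* BUILD_BUG_ON(sizeof(char) == 1); would fail to compile. */
    return 1;
}
```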

Signed-off-by: Jan Beulich <jbeulich@novell.com>
17 years agoAvoid negative runstate pieces.
Keir Fraser [Wed, 10 Dec 2008 14:05:41 +0000 (14:05 +0000)]
Avoid negative runstate pieces.

Also consolidate all places to get cpu idle time.

Signed-off-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
17 years agoInitialize state_entry_time to zero for all idle vcpus
Keir Fraser [Wed, 10 Dec 2008 13:41:34 +0000 (13:41 +0000)]
Initialize state_entry_time to zero for all idle vcpus

NOW() is not usable since the xen time sub-system hasn't
been initialized yet. On my box, it gives an initial
stamp of ~60s, because the local tsc stamp is zero and the TSC
has been counting since power on. A negative value
is then added to the runstate of that idle vcpu at a schedule
point. The net effect is that tools like xenpm
show a big idle time gap between the BSP and the other APs.

Signed-off-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
17 years agox86: Make MCE panic message more obvious
Keir Fraser [Wed, 10 Dec 2008 13:30:10 +0000 (13:30 +0000)]
x86: Make MCE panic message more obvious

Make it more obvious to the untrained user that machine check reboots
are hardware faults, rather than just saying "CPU context corrupt".

Signed-off-by: Tim Deegan <Tim.Deegan@citrix.com>
17 years agogdbserver: Fix build failure.
Keir Fraser [Wed, 10 Dec 2008 13:28:58 +0000 (13:28 +0000)]
gdbserver: Fix build failure.

From: Edwin Zhai <edwin.zhai@intel.com>
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
17 years agoAdd user PM control interface
Keir Fraser [Wed, 10 Dec 2008 13:27:41 +0000 (13:27 +0000)]
Add user PM control interface

Signed-off-by: Liu, Jinsong <jinsong.liu@intel.com>
17 years agoAdd cpufreq governors: performance, powersave, userspace
Keir Fraser [Wed, 10 Dec 2008 13:27:14 +0000 (13:27 +0000)]
Add cpufreq governors: performance, powersave, userspace

This patch adds three more governors alongside the existing ondemand
cpufreq governor:
the performance governor gives the best performance, keeping the cpu
always running at the highest freq;
the powersave governor gives the best power saving, keeping the cpu
always running at the lowest freq;
the userspace governor lets the user set the freq.
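
The three policies reduce to a simple choice of target frequency. A
hedged sketch with hypothetical type and function names; clamping the
user's request to the allowed range is an assumption of the example, not
something the commit message states.

```c
enum governor { GOV_PERFORMANCE, GOV_POWERSAVE, GOV_USERSPACE };

/* Hypothetical policy limits plus the frequency requested by the user. */
struct freq_policy {
    unsigned int min_khz;
    unsigned int max_khz;
    unsigned int user_khz;
};

/* Pick the frequency a governor would ask the driver for. */
static unsigned int governor_target(enum governor g,
                                    const struct freq_policy *p)
{
    switch (g) {
    case GOV_PERFORMANCE:
        return p->max_khz;              /* always the highest freq */
    case GOV_POWERSAVE:
        return p->min_khz;              /* always the lowest freq */
    case GOV_USERSPACE:
        if (p->user_khz < p->min_khz)   /* assumed: clamp to the limits */
            return p->min_khz;
        if (p->user_khz > p->max_khz)
            return p->max_khz;
        return p->user_khz;
    }
    return p->max_khz;
}
```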

Signed-off-by: Liu, Jinsong <jinsong.liu@intel.com>
17 years agolibxc: Fix memory leak in zlib usage
Keir Fraser [Wed, 10 Dec 2008 13:14:13 +0000 (13:14 +0000)]
libxc: Fix memory leak in zlib usage

Any call to inflate() must be followed by inflateEnd(), otherwise the
internal zlib state is leaked.

Signed-off-by: Kevin Wolf <kwolf@suse.de>
17 years agoUse virtual 8086 mode for VMX guests with CR0.PE == 0
Keir Fraser [Tue, 9 Dec 2008 16:28:02 +0000 (16:28 +0000)]
Use virtual 8086 mode for VMX guests with CR0.PE == 0

When a VMX guest tries to enter real mode, put it in virtual 8086 mode
instead, if that's possible.  Handle all errors and corner cases by
falling back to the real-mode emulator.

This is similar to the old VMXASSIST system except it uses Xen's
x86_emulate emulator instead of having a partial emulator in the guest
firmware.  It more than doubles the speed of real-mode operation on
VMX.

Signed-off-by: Tim Deegan <Tim.Deegan@citrix.com>
17 years agovga: Fix screen clear at end of Xen bootstrap.
Keir Fraser [Tue, 9 Dec 2008 13:23:15 +0000 (13:23 +0000)]
vga: Fix screen clear at end of Xen bootstrap.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
17 years agopv-on-hvm: add pvSCSI frontend
Keir Fraser [Tue, 9 Dec 2008 13:06:19 +0000 (13:06 +0000)]
pv-on-hvm: add pvSCSI frontend

Signed-off-by: Tomonari Horikoshi <t.horikoshi@jp.fujitsu.com>
Signed-off-by: Jun Kamada <kama@jp.fujitsu.com>
17 years agopv-on-hvm: fix for Centos 5.2
Keir Fraser [Tue, 9 Dec 2008 13:00:52 +0000 (13:00 +0000)]
pv-on-hvm: fix for Centos 5.2

From: Yoshisato YANAGISAWA <yanagisawa.yoshisato@lab.ntt.co.jp>
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
17 years agoVT-d: check return value of pirq_guest_bind()
Keir Fraser [Tue, 9 Dec 2008 12:55:29 +0000 (12:55 +0000)]
VT-d: check return value of pirq_guest_bind()

This eliminates a hypervisor crash when the respective domain dies or
has the device hot-removed.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Reviewed-by: Weidong Han <weidong.han@intel.com>
17 years agotools: Fix a few error-path memory leaks.
Keir Fraser [Tue, 9 Dec 2008 12:53:19 +0000 (12:53 +0000)]
tools: Fix a few error-path memory leaks.

Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
17 years agoxend: Remember bootable flag for vbds in xenstore
Keir Fraser [Tue, 9 Dec 2008 12:45:45 +0000 (12:45 +0000)]
xend: Remember bootable flag for vbds in xenstore

When xend is restarted, the bootable flags of all disk devices are lost
and the first disk is then marked as bootable by a "compatibility
hack". When a guest domain is created with a mixture of several vbd
and tap devices, the compatibility hack may fail to choose the right
bootable device, preventing the guest from being restarted. This patch
fixes this behavior by remembering the bootable flag for each disk
device in the xenstore database.
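
The difference between the compatibility hack and remembered flags can
be sketched as below. This is only an illustration: a plain dict
stands in for the xenstore database, and the function and device names
are hypothetical, not xend's.

```python
def bootable_flags(devices, stored_flags):
    """Decide which disk devices are bootable after a restart.

    devices: device names in configuration order (hypothetical names).
    stored_flags: dict of name -> bool standing in for flags that were
    remembered across the restart; empty if nothing was stored.
    """
    if any(name in stored_flags for name in devices):
        # Flags were remembered: trust them for every device.
        return {name: bool(stored_flags.get(name, False)) for name in devices}
    # Nothing remembered: fall back to the compatibility hack and
    # treat the first disk as the bootable one.
    return {name: (index == 0) for index, name in enumerate(devices)}
```

With a tap device listed before the vbd that actually boots, the
fallback picks the wrong disk, while the remembered flags do not.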

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
17 years agoxend: Fix memory allocation bug after hvm reboot in numa system
Keir Fraser [Tue, 9 Dec 2008 12:44:32 +0000 (12:44 +0000)]
xend: Fix memory allocation bug after hvm reboot in numa system

Recently we found a bug on a Nehalem machine (two nodes in total, 6G
of memory, 3G in each node):
- Start an HVM guest with all its VCPUs pinned to node1, so all its
memory is allocated from node1.
- Reboot the HVM guest.
- Some memory will be allocated from node0 even though there is enough
free memory on node1.

Reason: For security reasons, xen does not put the pages of a dying
hvm directly on the domheap, but puts them on the scrub list to be
handled by page_scrub_softirq(). If the dying hvm has a lot of memory,
page_scrub_softirq() will not have handled all of it before the new
hvm starts. Some pages belonging to node1 are still on the scrub list
and the new hvm cannot use them, so it gets a different memory
distribution than before. Before changeset 18304, page_scrub_softirq()
could be executed in parallel on all the cpus. Changeset 18305
serialised page_scrub_softirq(), and changeset 18307 serialised
page_scrub_softirq() with a new lock to avoid holding up acquisition
of page_scrub_lock in free_domheap_pages(). Those changesets slowed
down the handling of pages on the scrub list, so the bug became more
obvious afterwards.

Patch: This patch modifies balloon.free to avoid this bug. With the
patch, balloon.free checks whether the current machine is a numa
system and whether the newly created hvm has all its vcpus on the same
node. If both conditions hold, we wait until all the pages on the
scrub list are freed (if the wait goes beyond 20s, we stop waiting).
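
The check and the bounded wait described above might look roughly like
this. It is a sketch under stated assumptions, not the balloon.free
code: get_scrub_pages is a hypothetical callable returning the number
of pages still on the scrub list, and the poll interval is made up;
only the 20s limit comes from the description.

```python
import time


def should_wait_for_scrub(is_numa, vcpu_nodes):
    """True if the machine is numa and all vcpus sit on one node."""
    return bool(is_numa) and len(set(vcpu_nodes)) == 1


def wait_for_scrub(get_scrub_pages, timeout_s=20.0, poll_s=0.1,
                   sleep=time.sleep, clock=time.monotonic):
    """Poll until the scrub list is empty or the timeout expires.

    Returns True if the list drained, False if we gave up waiting.
    sleep and clock are injectable so the loop can be exercised
    without real delays.
    """
    deadline = clock() + timeout_s
    while get_scrub_pages() > 0:
        if clock() >= deadline:
            return False  # waited long enough; carry on regardless
        sleep(poll_s)
    return True
```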

This seems too restrictive at first glance. We used to wait only
until the free memory on the pinned node was bigger than required.
But as we know, HVM memory allocation granularity is 2M. Even when
the former condition is satisfied, we still may not find enough
2M-size chunks of memory on that node.
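
A small worked example of why the total free size is not enough. The
model is deliberately simplified and is an assumption of this sketch:
free memory is given as a list of contiguous extent sizes in KB, and
alignment is ignored. Only the 2M granularity comes from the
description above.

```python
CHUNK_KB = 2 * 1024  # 2M allocation granularity


def usable_chunks(free_extents_kb):
    """Count the whole 2M chunks that fit in the given free extents."""
    return sum(extent // CHUNK_KB for extent in free_extents_kb)


def can_allocate(free_extents_kb, required_kb):
    """True if enough 2M chunks exist to cover required_kb."""
    needed = -(-required_kb // CHUNK_KB)  # ceiling division
    return usable_chunks(free_extents_kb) >= needed


# Four free extents of 1.5M each: 6M free in total, yet no 2M chunk.
fragmented = [1536, 1536, 1536, 1536]
print(sum(fragmented) >= 4096, can_allocate(fragmented, 4096))
```

Here the node has more free memory than the 4M requested, but the
request still cannot be satisfied from it in 2M units.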

Signed-off-by: Ting Zhou <ting.g.zhou@intel.com>
Signed-off-by: Xiaowei Yang <Xiaowei.yang@intel.com>
17 years agolibxc: Fix gcc 4.3 build failure
Keir Fraser [Tue, 9 Dec 2008 12:42:18 +0000 (12:42 +0000)]
libxc: Fix gcc 4.3 build failure

Signed-off-by: Kouya Shimura <kouya@jp.fujitsu.com>
17 years agorombios: support BCV
Keir Fraser [Tue, 9 Dec 2008 12:41:12 +0000 (12:41 +0000)]
rombios: support BCV

Signed-off-by: Akio Takebe <takebe_akio@jp.fujitsu.com>
Signed-off-by: Kouya Shimura <kouya@jp.fujitsu.com>
17 years agoFix domain save when guest is in S3.
Keir Fraser [Fri, 5 Dec 2008 15:54:22 +0000 (15:54 +0000)]
Fix domain save when guest is in S3.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
17 years agox86: make an error message more precise
Keir Fraser [Fri, 5 Dec 2008 15:24:12 +0000 (15:24 +0000)]
x86: make an error message more precise

... making it possible to distinguish whether the binding being added
or the already existing PIRQ binding is causing the failure.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
17 years agocpufreq: allow customization of some parameters
Keir Fraser [Fri, 5 Dec 2008 15:23:32 +0000 (15:23 +0000)]
cpufreq: allow customization of some parameters

Short of having a way for powersaved to dynamically adjust these
values, at least allow specifying them on the command line. In
particular, always running at an up-threshold of 80% is perhaps nice
for laptop use, but certainly not desirable on servers. On shell
scripts invoking large numbers of short-lived processes I noticed a
50% performance degradation on a dual-socket quad-core Barcelona just
because of the load of an individual core never crossing the 80%
boundary that would have resulted in increasing the frequency.

(Powersaved on SLE10 sets this on native kernels to 60% or 80%,
depending on whether performance or power reduction is preferred,
*divided* by the number of CPUs, but capped at the lower limit of
20%.)
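
To make the arithmetic above concrete, here is a sketch of the
powersaved rule as described, together with a simple up-threshold
test. The function names are hypothetical, and the strict comparison
and the use of non-integer division are assumptions of the example.

```python
def powersaved_up_threshold(ncpus, prefer_performance):
    """Up-threshold in percent, following the rule described above:
    60% (performance) or 80% (power reduction), divided by the number
    of CPUs, but never below 20%."""
    base = 60.0 if prefer_performance else 80.0
    return max(20.0, base / ncpus)


def should_raise_frequency(load_pct, up_threshold_pct):
    """Assumption: the frequency is raised once load exceeds the threshold."""
    return load_pct > up_threshold_pct


# A core at 70% load never crosses a fixed 80% threshold, but it does
# cross the threshold powersaved would pick on an 8-cpu machine.
print(should_raise_frequency(70, 80))
print(should_raise_frequency(70, powersaved_up_threshold(8, False)))
```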

Signed-off-by: Jan Beulich <jbeulich@novell.com>
17 years agox86/cpufreq: reduce verbosity
Keir Fraser [Fri, 5 Dec 2008 15:22:43 +0000 (15:22 +0000)]
x86/cpufreq: reduce verbosity

These messages don't exist in powernow's equivalent code, and are
pretty useless anyway, hence just cluttering the logs.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
17 years agopowernow: implement struct cpufreq_driver.verify
Keir Fraser [Fri, 5 Dec 2008 15:22:21 +0000 (15:22 +0000)]
powernow: implement struct cpufreq_driver.verify

Without this, under rare conditions hypervisor crashes are possible
due to this method being called without checking against NULL.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
17 years agox86/32on64: adjust address when converting syscall to fault
Keir Fraser [Fri, 5 Dec 2008 15:21:59 +0000 (15:21 +0000)]
x86/32on64: adjust address when converting syscall to fault

The faulting address is at the start of the syscall instruction rather
than at the following one.
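
The adjustment amounts to stepping back over the instruction. A
minimal sketch, assuming the saved instruction pointer refers to the
instruction after the syscall (the function name is hypothetical; the
two-byte 0F 05 encoding of syscall is architectural):

```python
SYSCALL_OPCODE = bytes([0x0F, 0x05])  # encoding of the syscall instruction


def fault_address(saved_ip):
    """Address to report for the fault: the start of the syscall
    instruction, not the instruction that follows it."""
    return saved_ip - len(SYSCALL_OPCODE)
```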

Signed-off-by: Jan Beulich <jbeulich@novell.com>
17 years agox86, time: Fix scale_reciprocal().
Keir Fraser [Fri, 5 Dec 2008 14:46:38 +0000 (14:46 +0000)]
x86, time: Fix scale_reciprocal().
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
17 years agominios: Clip memory not usable by Mini-OS (above 1GB)
Keir Fraser [Fri, 5 Dec 2008 13:06:57 +0000 (13:06 +0000)]
minios: Clip memory not usable by Mini-OS (above 1GB)

Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
17 years agocpuidle: revise tsc-save/restore to reduce tsc skew between cpus
Keir Fraser [Fri, 5 Dec 2008 13:03:44 +0000 (13:03 +0000)]
cpuidle: revise tsc-save/restore to reduce tsc skew between cpus

Signed-off-by: Wei Gang <gang.wei@intel.com>
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
17 years agovga: Clear the screen when relinquishing VGA to dom0.
Keir Fraser [Fri, 5 Dec 2008 11:37:20 +0000 (11:37 +0000)]
vga: Clear the screen when relinquishing VGA to dom0.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>