GMMR0.cpp@ 92386

Last change on this file since 92386 was 92368, checked in by vboxsync, 3 years ago
VMM/PGM,GMM: Baked PGMR3PhysAllocateLargePage into PGMR0PhysAllocateLargePage eliminating VMMCALLRING3_PGM_ALLOCATE_LARGE_HANDY_PAGE; adjusted GMMR0AllocateLargePage to be ring-0 callable. Changed the large page allocation backoff logic a bit. Some more release stats. bugref:10093
Property svn:eol-style set to `native` Property svn:keywords set to `Id Revision`
File size: 202.7 KB

1	/* $Id: GMMR0.cpp 92368 2021-11-11 13:31:21Z vboxsync $ */
2	/** @file
3	* GMM - Global Memory Manager.
4	*/
5
6	/*
7	* Copyright (C) 2007-2020 Oracle Corporation
8	*
9	* This file is part of VirtualBox Open Source Edition (OSE), as
10	* available from http://www.virtualbox.org. This file is free software;
11	* you can redistribute it and/or modify it under the terms of the GNU
12	* General Public License (GPL) as published by the Free Software
13	* Foundation, in version 2 as it comes in the "COPYING" file of the
14	* VirtualBox OSE distribution. VirtualBox OSE is distributed in the
15	* hope that it will be useful, but WITHOUT ANY WARRANTY of any kind.
16	*/
17
18
19	/** @page pg_gmm GMM - The Global Memory Manager
20	*
21	* As the name indicates, this component is responsible for global memory
22	* management. Currently only guest RAM is allocated from the GMM, but this
23	* may change to include shadow page tables and other bits later.
24	*
25	* Guest RAM is managed as individual pages, but allocated from the host OS
26	* in chunks for reasons of portability / efficiency. To minimize the memory
27	* footprint, all tracking structures must be as small as possible without
28	* unnecessary performance penalties.
29	*
30	* The allocation chunks have a fixed size, defined at compile time
31	* by the #GMM_CHUNK_SIZE \#define.
32	*
33	* Each chunk is given a unique ID. Each page also has a unique ID. The
34	* relationship between the two IDs is:
35	* @code
36	* GMM_CHUNK_SHIFT = log2(GMM_CHUNK_SIZE / PAGE_SIZE);
37	* idPage = (idChunk << GMM_CHUNK_SHIFT) | iPage;
38	* @endcode
39	* Where iPage is the index of the page within the chunk. This ID scheme
40	* permits efficient chunk and page lookup, but it relies on the chunk size
41	* being set at compile time. The chunks are organized in an AVL tree with their
42	* IDs being the keys.
43	*
44	* The physical address of each page in an allocation chunk is maintained by
45	* the #RTR0MEMOBJ and obtained using #RTR0MemObjGetPagePhysAddr. There is no
46	* need to duplicate this information (it would cost 8 bytes per page if we did).
47	*
48	* So what do we need to track per page? Most importantly we need to know
49	* which state the page is in:
50	* - Private - Allocated for (eventually) backing one particular VM page.
51	* - Shared - Readonly page that is used by one or more VMs and treated
52	* as COW by PGM.
53	* - Free - Not used by anyone.
54	*
55	* For the page replacement operations (sharing, defragmenting and freeing)
56	* to be somewhat efficient, private pages need to be associated with a
57	* particular page in a particular VM.
58	*
59	* Tracking the usage of shared pages is impractical and expensive, so we'll
60	* settle for a reference counting system instead.
61	*
62	* Free pages will be chained on LIFOs.
63	*
64	* On 64-bit systems we will use a 64-bit bitfield per page, while on 32-bit
65	* systems a 32-bit bitfield will have to suffice because of address space
66	* limitations. The #GMMPAGE structure shows the details.
67	*
68	*
69	* @section sec_gmm_alloc_strat Page Allocation Strategy
70	*
71	* The strategy for allocating pages has to take fragmentation and shared
72	* pages into account, or we may end up with 2000 chunks with only
73	* a few pages in each. Shared pages cannot easily be reallocated because
74	* of the inaccurate usage accounting (see above). Private pages can be
75	* reallocated by a defragmentation thread in the same manner that sharing
76	* is done.
77	*
78	* The first approach is to manage the free pages in two sets depending on
79	* whether they are mainly for the allocation of shared or private pages.
80	* In the initial implementation there will be almost no possibility for
81	* mixing shared and private pages in the same chunk (only if we're really
82	* stressed on memory), but when we implement forking of VMs and have to
83	* deal with lots of COW pages it'll start getting kind of interesting.
84	*
85	* The sets are lists of chunks with approximately the same number of
86	* free pages. Say the chunk size is 1MB, meaning 256 pages, and a set
87	* consists of 16 lists. So, the first list will contain the chunks with
88	* 1-15 free pages, the second covers 16-31, and so on. The chunks will be
89	* moved between the lists as pages are freed up or allocated.
90	*
91	*
92	* @section sec_gmm_costs Costs
93	*
94	* The per page cost in kernel space is 32-bit plus whatever RTR0MEMOBJ
95	* entails. In addition there is the chunk cost of approximately
96	* (sizeof(RTR0MEMOBJ) + sizeof(CHUNK)) / 2^CHUNK_SHIFT bytes per page.
97	*
98	* On Windows the per page #RTR0MEMOBJ cost is 32-bit on 32-bit Windows
99	* and 64-bit on 64-bit Windows (a PFN_NUMBER in the MDL). So, 64-bit per page.
100	* The cost on Linux is identical, but here it's because of sizeof(struct page *).
101	*
102	*
103	* @section sec_gmm_legacy Legacy Mode for Non-Tier-1 Platforms
104	*
105	* In legacy mode the page source is locked user pages and not
106	* #RTR0MemObjAllocPhysNC, this means that a page can only be allocated
107	* by the VM that locked it. We will make no attempt at implementing
108	* page sharing on these systems, just do enough to make it all work.
109	*
110	* @note With 6.1 really dropping 32-bit support, the legacy mode is obsoleted
111	* under the assumption that there is sufficient kernel virtual address
112	* space to map all of the guest memory allocations. So, we'll be using
113	* #RTR0MemObjAllocPage on some platforms as an alternative to
114	* #RTR0MemObjAllocPhysNC.
115	*
116	*
117	* @subsection sub_gmm_locking Serializing
118	*
119	* One simple fast mutex will be employed in the initial implementation, not
120	* two as mentioned in @ref sec_pgmPhys_Serializing.
121	*
122	* @see @ref sec_pgmPhys_Serializing
123	*
124	*
125	* @section sec_gmm_overcommit Memory Over-Commitment Management
126	*
127	* The GVM will have to do the system wide memory over-commitment
128	* management. My current ideas are:
129	* - Per VM oc policy that indicates how much to initially commit
130	* to it and what to do in an out-of-memory situation.
131	* - Prevent overtaxing the host.
132	*
133	* There are some challenges here, the main ones are configurability and
134	* security. Should we for instance permit anyone to request 100% memory
135	* commitment? Who should be allowed to do runtime adjustments of the
136	* config? And how do we prevent these settings from being lost when the last
137	* VM process exits? The solution is probably to have an optional root
138	* daemon that will keep VMMR0.r0 in memory and enable the security measures.
139	*
140	*
141	*
142	* @section sec_gmm_numa NUMA
143	*
144	* NUMA considerations will be designed and implemented a bit later.
145	*
146	* The preliminary guess is that we will have to try to allocate memory as
147	* close as possible to the CPUs the VM is executed on (EMT and additional CPU
148	* threads), which means it's mostly about allocation and sharing policies.
149	* Both the scheduler and allocator interface will have to supply some NUMA info
150	* and we'll need to have a way to calculate access costs.
151	*
152	*/
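
The chunk ID / page ID relationship described in the page comment can be sketched as standalone code. This is a minimal illustration and not part of GMMR0.cpp: the 2 MiB chunk size, the 4 KiB page size, and the helper names (makePageId, chunkIdOf, pageIndexOf) are assumptions chosen for the example.

```cpp
#include <cassert>
#include <cstdint>

// Assumed values for illustration only: 2 MiB chunks made of 4 KiB pages.
constexpr uint32_t kPageShift   = 12;                      // log2(page size)
constexpr uint32_t kChunkSize   = 2u * 1024u * 1024u;      // stand-in for GMM_CHUNK_SIZE
constexpr uint32_t kChunkShift  = 9;                       // log2(kChunkSize / page size)
constexpr uint32_t kPageIdxMask = (1u << kChunkShift) - 1; // low bits hold the page index
static_assert((kChunkSize >> kPageShift) == (1u << kChunkShift),
              "the shift must match the compile-time chunk size");

// Combines a chunk ID and the index of a page within that chunk into a page ID.
inline uint32_t makePageId(uint32_t idChunk, uint32_t iPage)
{
    assert(iPage <= kPageIdxMask);
    return (idChunk << kChunkShift) | iPage;
}

// Recovers the chunk ID (the AVL tree key) from a page ID.
inline uint32_t chunkIdOf(uint32_t idPage)   { return idPage >> kChunkShift; }

// Recovers the index of the page within its chunk from a page ID.
inline uint32_t pageIndexOf(uint32_t idPage) { return idPage & kPageIdxMask; }
```

Because both directions are plain shifts and masks, a page ID can be resolved to its chunk with one tree (or TLB) lookup and one array index, which is why the scheme depends on the chunk size being a compile-time constant.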
153
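
The free-set idea from the allocation strategy section (lists of chunks grouped by free page count) can be sketched as follows. This is an assumption-laden illustration, not the code used later in this file: it takes the 256-pages-per-chunk, 16-lists-per-set figures from the example in the comment, divides the counts evenly (16 counts per list), and places fully free chunks on the last list. The name freeListIndex is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Assumed for illustration: 256 pages per chunk and 16 lists per free set.
constexpr uint32_t kPagesPerChunk = 256;
constexpr uint32_t kListsPerSet   = 16;
constexpr uint32_t kPagesPerList  = kPagesPerChunk / kListsPerSet; // 16 counts per list

// Returns the index of the list a chunk with cFree free pages belongs on.
// Chunks without free pages are not on any list, so cFree must be at least 1.
inline uint32_t freeListIndex(uint32_t cFree)
{
    assert(cFree >= 1 && cFree <= kPagesPerChunk);
    uint32_t idx = cFree / kPagesPerList;
    // A completely free chunk would index one past the end; keep it on the last list.
    return idx < kListsPerSet ? idx : kListsPerSet - 1;
}
```

When a page is allocated or freed, the caller would recompute the index and relink the chunk only if the index changed, so most operations leave the lists untouched.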
154
155	/*********************************************************************************************************************************
156	* Header Files *
157	*********************************************************************************************************************************/
158	#define LOG_GROUP LOG_GROUP_GMM
159	#include <VBox/rawpci.h>
160	#include <VBox/vmm/gmm.h>
161	#include "GMMR0Internal.h"
162	#include <VBox/vmm/vmcc.h>
163	#include <VBox/vmm/pgm.h>
164	#include <VBox/log.h>
165	#include <VBox/param.h>
166	#include <VBox/err.h>
167	#include <VBox/VMMDev.h>
168	#include <iprt/asm.h>
169	#include <iprt/avl.h>
170	#ifdef VBOX_STRICT
171	# include <iprt/crc.h>
172	#endif
173	#include <iprt/critsect.h>
174	#include <iprt/list.h>
175	#include <iprt/mem.h>
176	#include <iprt/memobj.h>
177	#include <iprt/mp.h>
178	#include <iprt/semaphore.h>
179	#include <iprt/spinlock.h>
180	#include <iprt/string.h>
181	#include <iprt/time.h>
182
183	/* This is 64-bit only code now. */
184	#if HC_ARCH_BITS != 64 || ARCH_BITS != 64
185	# error "This is 64-bit only code"
186	#endif
187
188
189	/*********************************************************************************************************************************
190	* Defined Constants And Macros *
191	*********************************************************************************************************************************/
192	/** @def VBOX_USE_CRIT_SECT_FOR_GIANT
193	* Use a critical section instead of a fast mutex for the giant GMM lock.
194	*
195	* @remarks This is primarily a way of avoiding the deadlock checks in the
196	* Windows Driver Verifier. */
197	#if defined(RT_OS_WINDOWS) || defined(RT_OS_DARWIN) || defined(DOXYGEN_RUNNING)
198	# define VBOX_USE_CRIT_SECT_FOR_GIANT
199	#endif
200
201
202	/*********************************************************************************************************************************
203	* Structures and Typedefs *
204	*********************************************************************************************************************************/
205	/** Pointer to set of free chunks. */
206	typedef struct GMMCHUNKFREESET *PGMMCHUNKFREESET;
207
208	/**
209	* The per-page tracking structure employed by the GMM.
210	*
211	* Because of the different layout on 32-bit and 64-bit hosts in earlier
212	* versions of the code, macros are used to get and set some of the data.
213	*/
214	typedef union GMMPAGE
215	{
216	/** Unsigned integer view. */
217	uint64_t u;
218
219	/** The common view. */
220	struct GMMPAGECOMMON
221	{
222	uint32_t uStuff1 : 32;
223	uint32_t uStuff2 : 30;
224	/** The page state. */
225	uint32_t u2State : 2;
226	} Common;
227
228	/** The view of a private page. */
229	struct GMMPAGEPRIVATE
230	{
231	/** The guest page frame number. (Max addressable: 2 ^ 44 - 16) */
232	uint32_t pfn;
233	/** The GVM handle. (64K VMs) */
234	uint32_t hGVM : 16;
235	/** Reserved. */
236	uint32_t u16Reserved : 14;
237	/** The page state. */
238	uint32_t u2State : 2;
239	} Private;
240
241	/** The view of a shared page. */
242	struct GMMPAGESHARED
243	{
244	/** The host page frame number. (Max addressable: 2 ^ 44 - 16) */
245	uint32_t pfn;
246	/** The reference count (64K VMs). */
247	uint32_t cRefs : 16;
248	/** Used for debug checksumming. */
249	uint32_t u14Checksum : 14;
250	/** The page state. */
251	uint32_t u2State : 2;
252	} Shared;
253
254	/** The view of a free page. */
255	struct GMMPAGEFREE
256	{
257	/** The index of the next page in the free list. UINT16_MAX is NIL. */
258	uint16_t iNext;
259	/** Reserved. Checksum or something? */
260	uint16_t u16Reserved0;
261	/** Reserved. Checksum or something? */
262	uint32_t u30Reserved1 : 29;
263	/** Set if the page was zeroed. */
264	uint32_t fZeroed : 1;
265	/** The page state. */
266	uint32_t u2State : 2;
267	} Free;
268	} GMMPAGE;
269	AssertCompileSize(GMMPAGE, sizeof(RTHCUINTPTR));
270	/** Pointer to a GMMPAGE. */
271	typedef GMMPAGE *PGMMPAGE;
272
273
274	/** @name The Page States.
275	* @{ */
276	/** A private page. */
277	#define GMM_PAGE_STATE_PRIVATE 0
278	/** A shared page. */
279	#define GMM_PAGE_STATE_SHARED 2
280	/** A free page. */
281	#define GMM_PAGE_STATE_FREE 3
282	/** @} */
283
284
285	/** @def GMM_PAGE_IS_PRIVATE
286	*
287	* @returns true if private, false if not.
288	* @param pPage The GMM page.
289	*/
290	#define GMM_PAGE_IS_PRIVATE(pPage) ( (pPage)->Common.u2State == GMM_PAGE_STATE_PRIVATE )
291
292	/** @def GMM_PAGE_IS_SHARED
293	*
294	* @returns true if shared, false if not.
295	* @param pPage The GMM page.
296	*/
297	#define GMM_PAGE_IS_SHARED(pPage) ( (pPage)->Common.u2State == GMM_PAGE_STATE_SHARED )
298
299	/** @def GMM_PAGE_IS_FREE
300	*
301	* @returns true if free, false if not.
302	* @param pPage The GMM page.
303	*/
304	#define GMM_PAGE_IS_FREE(pPage) ( (pPage)->Common.u2State == GMM_PAGE_STATE_FREE )
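
The way a single 64-bit word carries both the page state and state-specific data can be sketched without bitfields. This is an illustration only: it assumes the two state bits sit at the top of the word (which is what the bitfield order above yields on common little-endian ABIs, but bitfield layout is implementation-defined), and the helper names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Same numeric values as GMM_PAGE_STATE_PRIVATE / _SHARED / _FREE.
enum PageState : uint64_t { kPrivate = 0, kShared = 2, kFree = 3 };

// Assumption: the 2-bit state occupies bits 62-63 of the 64-bit entry.
constexpr unsigned kStateShift = 62;

inline uint64_t makePrivate(uint32_t pfn, uint16_t hGVM)
{
    return uint64_t(pfn) | (uint64_t(hGVM) << 32) | (uint64_t(kPrivate) << kStateShift);
}

inline uint64_t makeShared(uint32_t pfn, uint16_t cRefs)
{
    return uint64_t(pfn) | (uint64_t(cRefs) << 32) | (uint64_t(kShared) << kStateShift);
}

inline uint64_t makeFree(uint16_t iNext)
{
    return uint64_t(iNext) | (uint64_t(kFree) << kStateShift);
}

inline PageState stateOf(uint64_t u) { return static_cast<PageState>(u >> kStateShift); }
inline bool isPrivate(uint64_t u)    { return stateOf(u) == kPrivate; }
inline bool isFree(uint64_t u)       { return stateOf(u) == kFree; }

// Reads the free-list link; only meaningful while the page is in the free state.
inline uint16_t nextFreeIndex(uint64_t u)
{
    assert(isFree(u));
    return uint16_t(u & 0xffff);
}
```

Keeping the state in the same position for every view is what lets the state test run before the code knows which view of the entry applies.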
305
306	/** @def GMM_PAGE_PFN_LAST
307	* The last valid guest pfn range.
308	* @remark Some of the values outside the range have special meaning,
309	* see GMM_PAGE_PFN_UNSHAREABLE.
310	*/
311	#define GMM_PAGE_PFN_LAST UINT32_C(0xfffffff0)
312	AssertCompile(GMM_PAGE_PFN_LAST == (GMM_GCPHYS_LAST >> PAGE_SHIFT));
313
314	/** @def GMM_PAGE_PFN_UNSHAREABLE
315	* Indicates that this page isn't used for normal guest memory and thus isn't shareable.
316	*/
317	#define GMM_PAGE_PFN_UNSHAREABLE UINT32_C(0xfffffff1)
318	AssertCompile(GMM_PAGE_PFN_UNSHAREABLE == (GMM_GCPHYS_UNSHAREABLE >> PAGE_SHIFT));
319
320
321	/**
322	* A GMM allocation chunk ring-3 mapping record.
323	*
324	* This should really be associated with a session and not a VM, but
325	* it's simpler to associate it with a VM and clean up when the VM object
326	* is destroyed.
327	*/
328	typedef struct GMMCHUNKMAP
329	{
330	/** The mapping object. */
331	RTR0MEMOBJ hMapObj;
332	/** The VM owning the mapping. */
333	PGVM pGVM;
334	} GMMCHUNKMAP;
335	/** Pointer to a GMM allocation chunk mapping. */
336	typedef struct GMMCHUNKMAP *PGMMCHUNKMAP;
337
338
339	/**
340	* A GMM allocation chunk.
341	*/
342	typedef struct GMMCHUNK
343	{
344	/** The AVL node core.
345	* The Key is the chunk ID. (Giant mtx.) */
346	AVLU32NODECORE Core;
347	/** The memory object.
348	* Either from RTR0MemObjAllocPhysNC or RTR0MemObjLockUser depending on
349	* what the host can dish up. (Chunk mtx protects mapping accesses
350	* and related frees.) */
351	RTR0MEMOBJ hMemObj;
352	#ifndef VBOX_WITH_LINEAR_HOST_PHYS_MEM
353	/** Pointer to the kernel mapping. */
354	uint8_t *pbMapping;
355	#endif
356	/** Pointer to the next chunk in the free list. (Giant mtx.) */
357	PGMMCHUNK pFreeNext;
358	/** Pointer to the previous chunk in the free list. (Giant mtx.) */
359	PGMMCHUNK pFreePrev;
360	/** Pointer to the free set this chunk belongs to. NULL for
361	* chunks with no free pages. (Giant mtx.) */
362	PGMMCHUNKFREESET pSet;
363	/** List node in the chunk list (GMM::ChunkList). (Giant mtx.) */
364	RTLISTNODE ListNode;
365	/** Pointer to an array of mappings. (Chunk mtx.) */
366	PGMMCHUNKMAP paMappingsX;
367	/** The number of mappings. (Chunk mtx.) */
368	uint16_t cMappingsX;
369	/** The mapping lock this chunk is using. UINT8_MAX if nobody is mapping
370	* or freeing anything. (Giant mtx.) */
371	uint8_t volatile iChunkMtx;
372	/** GMM_CHUNK_FLAGS_XXX. (Giant mtx.) */
373	uint8_t fFlags;
374	/** The head of the list of free pages. UINT16_MAX is the NIL value.
375	* (Giant mtx.) */
376	uint16_t iFreeHead;
377	/** The number of free pages. (Giant mtx.) */
378	uint16_t cFree;
379	/** The GVM handle of the VM that first allocated pages from this chunk, this
380	* is used as a preference when there are several chunks to choose from.
381	* When in bound memory mode this isn't a preference any longer. (Giant
382	* mtx.) */
383	uint16_t hGVM;
384	/** The ID of the NUMA node the memory mostly resides on. (Reserved for
385	* future use.) (Giant mtx.) */
386	uint16_t idNumaNode;
387	/** The number of private pages. (Giant mtx.) */
388	uint16_t cPrivate;
389	/** The number of shared pages. (Giant mtx.) */
390	uint16_t cShared;
391	/** The UID this chunk is associated with. */
392	RTUID uidOwner;
393	uint32_t u32Padding;
394	/** The pages. (Giant mtx.) */
395	GMMPAGE aPages[GMM_CHUNK_SIZE >> PAGE_SHIFT];
396	} GMMCHUNK;
397
398	/** Indicates that the NUMA properties of the memory are unknown. */
399	#define GMM_CHUNK_NUMA_ID_UNKNOWN UINT16_C(0xfffe)
400
401	/** @name GMM_CHUNK_FLAGS_XXX - chunk flags.
402	* @{ */
403	/** Indicates that the chunk is a large page (2MB). */
404	#define GMM_CHUNK_FLAGS_LARGE_PAGE UINT16_C(0x0001)
405	/** @} */
406
407
408	/**
409	* An allocation chunk TLB entry.
410	*/
411	typedef struct GMMCHUNKTLBE
412	{
413	/** The chunk id. */
414	uint32_t idChunk;
415	/** Pointer to the chunk. */
416	PGMMCHUNK pChunk;
417	} GMMCHUNKTLBE;
418	/** Pointer to an allocation chunk TLB entry. */
419	typedef GMMCHUNKTLBE *PGMMCHUNKTLBE;
420
421
422	/** The number of entries in the allocation chunk TLB. */
423	#define GMM_CHUNKTLB_ENTRIES 32
424	/** Gets the TLB entry index for the given Chunk ID. */
425	#define GMM_CHUNKTLB_IDX(idChunk) ( (idChunk) & (GMM_CHUNKTLB_ENTRIES - 1) )
426
427	/**
428	* An allocation chunk TLB.
429	*/
430	typedef struct GMMCHUNKTLB
431	{
432	/** The TLB entries. */
433	GMMCHUNKTLBE aEntries[GMM_CHUNKTLB_ENTRIES];
434	} GMMCHUNKTLB;
435	/** Pointer to an allocation chunk TLB. */
436	typedef GMMCHUNKTLB *PGMMCHUNKTLB;
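
The TLB structures above amount to a small direct-mapped cache keyed by the low bits of the chunk ID. The sketch below shows that lookup pattern in isolation. It is not the lookup code from this file: the Chunk stand-in, the NIL value of 0, and the method names are assumptions, and the locking the real structure requires is omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct Chunk { uint32_t id; };           // hypothetical stand-in for GMMCHUNK

constexpr uint32_t kNilChunkId = 0;      // assumed NIL chunk ID
constexpr size_t   kTlbEntries = 32;     // same count as GMM_CHUNKTLB_ENTRIES
static_assert((kTlbEntries & (kTlbEntries - 1)) == 0,
              "masking with (entries - 1) needs a power of two");

struct ChunkTlb
{
    struct Entry { uint32_t idChunk = kNilChunkId; Chunk *pChunk = nullptr; };
    Entry aEntries[kTlbEntries];

    // Same computation as GMM_CHUNKTLB_IDX: the low bits select the slot.
    static size_t indexOf(uint32_t idChunk) { return idChunk & (kTlbEntries - 1); }

    // Returns the cached chunk, or nullptr on a miss (caller falls back to the tree).
    Chunk *lookup(uint32_t idChunk) const
    {
        const Entry &e = aEntries[indexOf(idChunk)];
        return e.idChunk == idChunk ? e.pChunk : nullptr;
    }

    // Caches a chunk, evicting whatever previously occupied its slot.
    void insert(Chunk *pChunk)
    {
        assert(pChunk != nullptr);
        Entry &e = aEntries[indexOf(pChunk->id)];
        e.idChunk = pChunk->id;
        e.pChunk  = pChunk;
    }
};
```

Two chunk IDs that differ by a multiple of 32 compete for the same slot, so a miss only means the slower tree lookup is needed, never that the chunk does not exist.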
437
438
439	/**
440	* The GMM instance data.
441	*/
442	typedef struct GMM
443	{
444	/** Magic / eye catcher. GMM_MAGIC */
445	uint32_t u32Magic;
446	/** The number of threads waiting on the mutex. */
447	uint32_t cMtxContenders;
448	#ifdef VBOX_USE_CRIT_SECT_FOR_GIANT
449	/** The critical section protecting the GMM.
450	* More fine grained locking can be implemented later if necessary. */
451	RTCRITSECT GiantCritSect;
452	#else
453	/** The fast mutex protecting the GMM.
454	* More fine grained locking can be implemented later if necessary. */
455	RTSEMFASTMUTEX hMtx;
456	#endif
457	#ifdef VBOX_STRICT
458	/** The current mutex owner. */
459	RTNATIVETHREAD hMtxOwner;
460	#endif
461	/** Spinlock protecting the AVL tree.
462	* @todo Make this a read-write spinlock as we should allow concurrent
463	* lookups. */
464	RTSPINLOCK hSpinLockTree;
465	/** The chunk tree.
466	* Protected by hSpinLockTree. */
467	PAVLU32NODECORE pChunks;
468	/** Chunk freeing generation - incremented whenever a chunk is freed. Used
469	* for validating the per-VM chunk TLB entries. Valid range is 1 to 2^62
470	* (exclusive), though higher numbers may temporarily occur while
471	* invalidating the individual TLBs during wrap-around processing. */
472	uint64_t volatile idFreeGeneration;
473	/** The chunk TLB.
474	* Protected by hSpinLockTree. */
475	GMMCHUNKTLB ChunkTLB;
476	/** The private free set. */
477	GMMCHUNKFREESET PrivateX;
478	/** The shared free set. */
479	GMMCHUNKFREESET Shared;
480
481	/** Shared module tree (global).
482	* @todo separate trees for distinctly different guest OSes. */
483	PAVLLU32NODECORE pGlobalSharedModuleTree;
484	/** Shareable modules (count of nodes in pGlobalSharedModuleTree). */
485	uint32_t cShareableModules;
486
487	/** The chunk list. This simplifies the cleanup process and avoids tree
488	* traversal. */
489	RTLISTANCHOR ChunkList;
490
491	/** The maximum number of pages we're allowed to allocate.
492	* @gcfgm{GMM/MaxPages,64-bit, Direct.}
493	* @gcfgm{GMM/PctPages,32-bit, Relative to the number of host pages.} */
494	uint64_t cMaxPages;
495	/** The number of pages that have been reserved.
496	* The deal is that cReservedPages - cOverCommittedPages <= cMaxPages. */
497	uint64_t cReservedPages;
498	/** The number of pages that we have over-committed in reservations. */
499	uint64_t cOverCommittedPages;
500	/** The number of actually allocated (committed if you like) pages. */
501	uint64_t cAllocatedPages;
502	/** The number of pages that are shared. A subset of cAllocatedPages. */
503	uint64_t cSharedPages;
504	/** The number of pages that are actually shared between VMs. */
505	uint64_t cDuplicatePages;
506	/** The number of shared pages that have been left behind by
507	* VMs not doing proper cleanups. */
508	uint64_t cLeftBehindSharedPages;
509	/** The number of allocation chunks.
510	* (The number of pages we've allocated from the host can be derived from this.) */
511	uint32_t cChunks;
512	/** The number of current ballooned pages. */
513	uint64_t cBalloonedPages;
514
515	#ifdef VBOX_WITH_LINEAR_HOST_PHYS_MEM
516	/** Whether #RTR0MemObjAllocPhysNC works. */
517	bool fHasWorkingAllocPhysNC;
518	#else
519	bool fPadding;
520	#endif
521	/** The bound memory mode indicator.
522	* When set, the memory will be bound to a specific VM and never
523	* shared. This is always set if fLegacyAllocationMode is set.
524	* (Also determined at initialization time.) */
525	bool fBoundMemoryMode;
526	/** The number of registered VMs. */
527	uint16_t cRegisteredVMs;
528
529	/** The index of the next mutex to use. */
530	uint32_t iNextChunkMtx;
531	/** Chunk locks for reducing lock contention without having to allocate
532	* one lock per chunk. */
533	struct
534	{
535	/** The mutex */
536	RTSEMFASTMUTEX hMtx;
537	/** The number of threads currently using this mutex. */
538	uint32_t volatile cUsers;
539	} aChunkMtx[64];
540
541	/** The number of freed chunks ever. This is used as list generation to
542	* avoid restarting the cleanup scanning when the list wasn't modified. */
543	uint32_t volatile cFreedChunks;
544	/** The previously allocated Chunk ID.
545	* Used as a hint to avoid scanning the whole bitmap. */
546	uint32_t idChunkPrev;
547	/** Spinlock protecting idChunkPrev & bmChunkId. */
548	RTSPINLOCK hSpinLockChunkId;
549	/** Chunk ID allocation bitmap.
550	* Bits of allocated IDs are set, free ones are clear.
551	* The NIL id (0) is marked allocated. */
552	uint32_t bmChunkId[(GMM_CHUNKID_LAST + 1 + 31) / 32];
553	} GMM;
554	/** Pointer to the GMM instance. */
555	typedef GMM *PGMM;
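
The chunk ID bookkeeping described by idChunkPrev and bmChunkId (a bitmap with the NIL ID pre-marked and a hint for where to resume scanning) can be sketched as below. This is a simplified, single-threaded illustration under stated assumptions: the class and method names are hypothetical, the spinlock is omitted, and the scan is a plain linear probe rather than whatever bit-scanning helpers the real code uses.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

class ChunkIdAllocator
{
public:
    static constexpr uint32_t kNilId = 0;

    explicit ChunkIdAllocator(uint32_t idLast)
        : m_idLast(idLast), m_bitmap((idLast + 1 + 31) / 32, 0u), m_idPrev(kNilId)
    {
        assert(idLast >= 1);
        setBit(kNilId);                 // the NIL ID is marked allocated up front
    }

    // Returns a free ID, or kNilId when every ID is in use.
    uint32_t allocate()
    {
        for (uint32_t n = 0; n < m_idLast; n++)
        {
            // Start just after the previous allocation and wrap around, skipping 0.
            uint32_t id = (m_idPrev + n) % m_idLast + 1;
            if (!testBit(id))
            {
                setBit(id);
                m_idPrev = id;          // hint for the next scan
                return id;
            }
        }
        return kNilId;
    }

    void release(uint32_t id)
    {
        assert(id != kNilId && id <= m_idLast && testBit(id));
        m_bitmap[id / 32] &= ~(1u << (id % 32));
    }

private:
    bool testBit(uint32_t id) const { return (m_bitmap[id / 32] >> (id % 32)) & 1u; }
    void setBit(uint32_t id)        { m_bitmap[id / 32] |= 1u << (id % 32); }

    uint32_t              m_idLast;   // highest valid ID
    std::vector<uint32_t> m_bitmap;   // set bit = allocated, clear bit = free
    uint32_t              m_idPrev;   // previously allocated ID
};
```

Resuming after the previous ID keeps the common case short, because recently freed IDs are rare compared with the untouched range ahead of the hint.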
556
557	/** The value of GMM::u32Magic (Katsuhiro Otomo). */
558	#define GMM_MAGIC UINT32_C(0x19540414)
559
560
561	/**
562	* GMM chunk mutex state.
563	*
564	* This is returned by gmmR0ChunkMutexAcquire and is used by the other
565	* gmmR0ChunkMutex* methods.
566	*/
567	typedef struct GMMR0CHUNKMTXSTATE
568	{
569	PGMM pGMM;
570	/** The index of the chunk mutex. */
571	uint8_t iChunkMtx;
572	/** The relevant flags (GMMR0CHUNK_MTX_XXX). */
573	uint8_t fFlags;
574	} GMMR0CHUNKMTXSTATE;
575	/** Pointer to a chunk mutex state. */
576	typedef GMMR0CHUNKMTXSTATE *PGMMR0CHUNKMTXSTATE;
577
578	/** @name GMMR0CHUNK_MTX_XXX
579	* @{ */
580	#define GMMR0CHUNK_MTX_INVALID UINT32_C(0)
581	#define GMMR0CHUNK_MTX_KEEP_GIANT UINT32_C(1)
582	#define GMMR0CHUNK_MTX_RETAKE_GIANT UINT32_C(2)
583	#define GMMR0CHUNK_MTX_DROP_GIANT UINT32_C(3)
584	#define GMMR0CHUNK_MTX_END UINT32_C(4)
585	/** @} */
586
587
588	/** The maximum number of shared modules per-vm. */
589	#define GMM_MAX_SHARED_PER_VM_MODULES 2048
590	/** The maximum number of shared modules GMM is allowed to track. */
591	#define GMM_MAX_SHARED_GLOBAL_MODULES 16834
592
593
594	/**
595	* Argument packet for gmmR0SharedModuleCleanup.
596	*/
597	typedef struct GMMR0SHMODPERVMDTORARGS
598	{
599	PGVM pGVM;
600	PGMM pGMM;
601	} GMMR0SHMODPERVMDTORARGS;
602
603	/**
604	* Argument packet for gmmR0CheckSharedModule.
605	*/
606	typedef struct GMMCHECKSHAREDMODULEINFO
607	{
608	PGVM pGVM;
609	VMCPUID idCpu;
610	} GMMCHECKSHAREDMODULEINFO;
611
612
613	/*********************************************************************************************************************************
614	* Global Variables *
615	*********************************************************************************************************************************/
616	/** Pointer to the GMM instance data. */
617	static PGMM g_pGMM = NULL;
618
619	/** Macro for obtaining and validating the g_pGMM pointer.
620	*
621	* On failure it will return from the invoking function with the specified
622	* return value.
623	*
624	* @param pGMM The name of the pGMM variable.
625	* @param rc The return value on failure. Use VERR_GMM_INSTANCE for VBox
626	* status codes.
627	*/
628	#define GMM_GET_VALID_INSTANCE(pGMM, rc) \
629	do { \
630	(pGMM) = g_pGMM; \
631	AssertPtrReturn((pGMM), (rc)); \
632	AssertMsgReturn((pGMM)->u32Magic == GMM_MAGIC, ("%p - %#x\n", (pGMM), (pGMM)->u32Magic), (rc)); \
633	} while (0)
634
635	/** Macro for obtaining and validating the g_pGMM pointer, void function
636	* variant.
637	*
638	* On failure it will return from the invoking function.
639	*
640	* @param pGMM The name of the pGMM variable.
641	*/
642	#define GMM_GET_VALID_INSTANCE_VOID(pGMM) \
643	do { \
644	(pGMM) = g_pGMM; \
645	AssertPtrReturnVoid((pGMM)); \
646	AssertMsgReturnVoid((pGMM)->u32Magic == GMM_MAGIC, ("%p - %#x\n", (pGMM), (pGMM)->u32Magic)); \
647	} while (0)
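
The two macros above follow a common pattern: fetch a global pointer, then refuse to continue unless it is non-null and carries the expected magic. A function-based sketch of the same check is shown below. The names (Instance, getValidInstance, doWork, kErrInstance) are hypothetical stand-ins, and the assertion/logging side effects of the real macros are left out.

```cpp
#include <cassert>
#include <cstdint>

struct Instance { uint32_t u32Magic; };       // stand-in for the GMM instance data

constexpr uint32_t kMagic       = 0x19540414; // same value as GMM_MAGIC
constexpr int      kErrInstance = -1;         // hypothetical stand-in for a status code

static Instance *g_pInstance = nullptr;

// Returns the global instance if it is set and carries the expected magic, else nullptr.
static Instance *getValidInstance()
{
    Instance *p = g_pInstance;
    if (p == nullptr || p->u32Magic != kMagic)
        return nullptr;
    return p;
}

// Example entry point: bail out with an error before touching any instance state.
static int doWork()
{
    Instance *p = getValidInstance();
    if (p == nullptr)
        return kErrInstance;
    assert(p->u32Magic == kMagic);
    return 0;
}
```

The macro form in the file exists so that the early return happens in the calling function; a helper function like this one needs the explicit check at each call site instead.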
648
649
650	/** @def GMM_CHECK_SANITY_UPON_ENTERING
651	* Checks the sanity of the GMM instance data before making changes.
652	*
653	* This macro is a stub by default and must be enabled manually in the code.
654	*
655	* @returns true if sane, false if not.
656	* @param pGMM The name of the pGMM variable.
657	*/
658	#if defined(VBOX_STRICT) && defined(GMMR0_WITH_SANITY_CHECK) && 0
659	# define GMM_CHECK_SANITY_UPON_ENTERING(pGMM) (RT_LIKELY(gmmR0SanityCheck((pGMM), __PRETTY_FUNCTION__, __LINE__) == 0))
660	#else
661	# define GMM_CHECK_SANITY_UPON_ENTERING(pGMM) (true)
662	#endif
663
664	/** @def GMM_CHECK_SANITY_UPON_LEAVING
665	* Checks the sanity of the GMM instance data after making changes.
666	*
667	* This macro is a stub by default and must be enabled manually in the code.
668	*
669	* @returns true if sane, false if not.
670	* @param pGMM The name of the pGMM variable.
671	*/
672	#if defined(VBOX_STRICT) && defined(GMMR0_WITH_SANITY_CHECK) && 0
673	# define GMM_CHECK_SANITY_UPON_LEAVING(pGMM) (gmmR0SanityCheck((pGMM), __PRETTY_FUNCTION__, __LINE__) == 0)
674	#else
675	# define GMM_CHECK_SANITY_UPON_LEAVING(pGMM) (true)
676	#endif
677
678	/** @def GMM_CHECK_SANITY_IN_LOOPS
679	* Checks the sanity of the GMM instance in the allocation loops.
680	*
681	* This macro is a stub by default and must be enabled manually in the code.
682	*
683	* @returns true if sane, false if not.
684	* @param pGMM The name of the pGMM variable.
685	*/
686	#if defined(VBOX_STRICT) && defined(GMMR0_WITH_SANITY_CHECK) && 0
687	# define GMM_CHECK_SANITY_IN_LOOPS(pGMM) (gmmR0SanityCheck((pGMM), __PRETTY_FUNCTION__, __LINE__) == 0)
688	#else
689	# define GMM_CHECK_SANITY_IN_LOOPS(pGMM) (true)
690	#endif
691
692
693	/*********************************************************************************************************************************
694	* Internal Functions *
695	*********************************************************************************************************************************/
696	static DECLCALLBACK(int) gmmR0TermDestroyChunk(PAVLU32NODECORE pNode, void *pvGMM);
697	static bool gmmR0CleanupVMScanChunk(PGMM pGMM, PGVM pGVM, PGMMCHUNK pChunk);
698	DECLINLINE(void) gmmR0UnlinkChunk(PGMMCHUNK pChunk);
699	DECLINLINE(void) gmmR0LinkChunk(PGMMCHUNK pChunk, PGMMCHUNKFREESET pSet);
700	DECLINLINE(void) gmmR0SelectSetAndLinkChunk(PGMM pGMM, PGVM pGVM, PGMMCHUNK pChunk);
701	#ifdef GMMR0_WITH_SANITY_CHECK
702	static uint32_t gmmR0SanityCheck(PGMM pGMM, const char *pszFunction, unsigned uLineNo);
703	#endif
704	static bool gmmR0FreeChunk(PGMM pGMM, PGVM pGVM, PGMMCHUNK pChunk, bool fRelaxedSem);
705	DECLINLINE(void) gmmR0FreePrivatePage(PGMM pGMM, PGVM pGVM, uint32_t idPage, PGMMPAGE pPage);
706	DECLINLINE(void) gmmR0FreeSharedPage(PGMM pGMM, PGVM pGVM, uint32_t idPage, PGMMPAGE pPage);
707	static int gmmR0UnmapChunkLocked(PGMM pGMM, PGVM pGVM, PGMMCHUNK pChunk);
708	#ifdef VBOX_WITH_PAGE_SHARING
709	static void gmmR0SharedModuleCleanup(PGMM pGMM, PGVM pGVM);
710	# ifdef VBOX_STRICT
711	static uint32_t gmmR0StrictPageChecksum(PGMM pGMM, PGVM pGVM, uint32_t idPage);
712	# endif
713	#endif
714
715
716
717	/**
718	* Initializes the GMM component.
719	*
720	* This is called when the VMMR0.r0 module is loaded and protected by the
721	* loader semaphore.
722	*
723	* @returns VBox status code.
724	*/
725	GMMR0DECL(int) GMMR0Init(void)
726	{
727	LogFlow(("GMMInit:\n"));
728
729	/*
730	* Allocate the instance data and the locks.
731	*/
732	PGMM pGMM = (PGMM)RTMemAllocZ(sizeof(*pGMM));
733	if (!pGMM)
734	return VERR_NO_MEMORY;
735
736	pGMM->u32Magic = GMM_MAGIC;
737	for (unsigned i = 0; i < RT_ELEMENTS(pGMM->ChunkTLB.aEntries); i++)
738	pGMM->ChunkTLB.aEntries[i].idChunk = NIL_GMM_CHUNKID;
739	RTListInit(&pGMM->ChunkList);
740	ASMBitSet(&pGMM->bmChunkId[0], NIL_GMM_CHUNKID);
741
742	#ifdef VBOX_USE_CRIT_SECT_FOR_GIANT
743	int rc = RTCritSectInit(&pGMM->GiantCritSect);
744	#else
745	int rc = RTSemFastMutexCreate(&pGMM->hMtx);
746	#endif
747	if (RT_SUCCESS(rc))
748	{
749	unsigned iMtx;
750	for (iMtx = 0; iMtx < RT_ELEMENTS(pGMM->aChunkMtx); iMtx++)
751	{
752	rc = RTSemFastMutexCreate(&pGMM->aChunkMtx[iMtx].hMtx);
753	if (RT_FAILURE(rc))
754	break;
755	}
756	pGMM->hSpinLockTree = NIL_RTSPINLOCK;
757	if (RT_SUCCESS(rc))
758	rc = RTSpinlockCreate(&pGMM->hSpinLockTree, RTSPINLOCK_FLAGS_INTERRUPT_SAFE, "gmm-chunk-tree");
759	pGMM->hSpinLockChunkId = NIL_RTSPINLOCK;
760	if (RT_SUCCESS(rc))
761	rc = RTSpinlockCreate(&pGMM->hSpinLockChunkId, RTSPINLOCK_FLAGS_INTERRUPT_SAFE, "gmm-chunk-id");
762	if (RT_SUCCESS(rc))
763	{
764	/*
765	* Figure out how we're going to allocate stuff (only applicable to
766	* hosts with linear physical memory mappings).
767	*/
768	pGMM->fBoundMemoryMode = false;
769	#ifdef VBOX_WITH_LINEAR_HOST_PHYS_MEM
770	pGMM->fHasWorkingAllocPhysNC = false;
771
772	RTR0MEMOBJ hMemObj;
773	rc = RTR0MemObjAllocPhysNC(&hMemObj, GMM_CHUNK_SIZE, NIL_RTHCPHYS);
774	if (RT_SUCCESS(rc))
775	{
776	rc = RTR0MemObjFree(hMemObj, true);
777	AssertRC(rc);
778	pGMM->fHasWorkingAllocPhysNC = true;
779	}
780	else if (rc != VERR_NOT_SUPPORTED)
781	SUPR0Printf("GMMR0Init: Warning! RTR0MemObjAllocPhysNC(, %u, NIL_RTHCPHYS) -> %d!\n", GMM_CHUNK_SIZE, rc);
782	# endif
783
784	/*
785	* Query system page count and guess a reasonable cMaxPages value.
786	*/
787	pGMM->cMaxPages = UINT32_MAX; /** @todo IPRT function for query ram size and such. */
788
789	/*
790	* The idFreeGeneration value should be set so we actually trigger the
791	* wrap-around invalidation handling during a typical test run.
792	*/
793	pGMM->idFreeGeneration = UINT64_MAX / 4 - 128;
794
795	g_pGMM = pGMM;
796	#ifdef VBOX_WITH_LINEAR_HOST_PHYS_MEM
797	LogFlow(("GMMInit: pGMM=%p fBoundMemoryMode=%RTbool fHasWorkingAllocPhysNC=%RTbool\n", pGMM, pGMM->fBoundMemoryMode, pGMM->fHasWorkingAllocPhysNC));
798	#else
799	LogFlow(("GMMInit: pGMM=%p fBoundMemoryMode=%RTbool\n", pGMM, pGMM->fBoundMemoryMode));
800	#endif
801	return VINF_SUCCESS;
802	}
803
804	/*
805	* Bail out.
806	*/
807	RTSpinlockDestroy(pGMM->hSpinLockChunkId);
808	RTSpinlockDestroy(pGMM->hSpinLockTree);
809	while (iMtx-- > 0)
810	RTSemFastMutexDestroy(pGMM->aChunkMtx[iMtx].hMtx);
811	#ifdef VBOX_USE_CRIT_SECT_FOR_GIANT
812	RTCritSectDelete(&pGMM->GiantCritSect);
813	#else
814	RTSemFastMutexDestroy(pGMM->hMtx);
815	#endif
816	}
817
818	pGMM->u32Magic = 0;
819	RTMemFree(pGMM);
820	SUPR0Printf("GMMR0Init: failed! rc=%d\n", rc);
821	return rc;
822	}
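
The bail-out path in GMMR0Init relies on the loop counter to know how many chunk mutexes were actually created, and destroys exactly those in reverse order. The sketch below isolates that unwinding idiom with mock resources. Everything in it (createLock, destroyLock, initAll, the failure injection) is hypothetical scaffolding for the illustration.

```cpp
#include <cassert>
#include <cstddef>

constexpr size_t kNumLocks = 4;
static int g_cLive = 0;                 // number of currently live mock locks

// Mock creation that fails at a chosen index (pass kNumLocks for no failure).
static bool createLock(size_t i, size_t iFailAt)
{
    if (i == iFailAt)
        return false;
    ++g_cLive;
    return true;
}

static void destroyLock() { --g_cLive; }

// Creates kNumLocks locks; on failure destroys only the ones already created.
static bool initAll(size_t iFailAt)
{
    size_t i;
    for (i = 0; i < kNumLocks; i++)
        if (!createLock(i, iFailAt))
            break;
    if (i == kNumLocks)
        return true;

    // i is the index of the failed creation, so indexes [0, i) are live.
    while (i-- > 0)
        destroyLock();
    return false;
}
```

Declaring the counter outside the loop is the key detail: it must survive the loop so the cleanup code can tell a partial initialization from a complete one.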
823
824
825	/**
826	* Terminates the GMM component.
827	*/
828	GMMR0DECL(void) GMMR0Term(void)
829	{
830	LogFlow(("GMMTerm:\n"));
831
832	/*
833	* Take care / be paranoid...
834	*/
835	PGMM pGMM = g_pGMM;
836	if (!RT_VALID_PTR(pGMM))
837	return;
838	if (pGMM->u32Magic != GMM_MAGIC)
839	{
840	SUPR0Printf("GMMR0Term: u32Magic=%#x\n", pGMM->u32Magic);
841	return;
842	}
843
844	/*
845	* Undo what init did and free all the resources we've acquired.
846	*/
847	/* Destroy the fundamentals. */
848	g_pGMM = NULL;
849	pGMM->u32Magic = ~GMM_MAGIC;
850	#ifdef VBOX_USE_CRIT_SECT_FOR_GIANT
851	RTCritSectDelete(&pGMM->GiantCritSect);
852	#else
853	RTSemFastMutexDestroy(pGMM->hMtx);
854	pGMM->hMtx = NIL_RTSEMFASTMUTEX;
855	#endif
856	RTSpinlockDestroy(pGMM->hSpinLockTree);
857	pGMM->hSpinLockTree = NIL_RTSPINLOCK;
858	RTSpinlockDestroy(pGMM->hSpinLockChunkId);
859	pGMM->hSpinLockChunkId = NIL_RTSPINLOCK;
860
861	/* Free any chunks still hanging around. */
862	RTAvlU32Destroy(&pGMM->pChunks, gmmR0TermDestroyChunk, pGMM);
863
864	/* Destroy the chunk locks. */
865	for (unsigned iMtx = 0; iMtx < RT_ELEMENTS(pGMM->aChunkMtx); iMtx++)
866	{
867	Assert(pGMM->aChunkMtx[iMtx].cUsers == 0);
868	RTSemFastMutexDestroy(pGMM->aChunkMtx[iMtx].hMtx);
869	pGMM->aChunkMtx[iMtx].hMtx = NIL_RTSEMFASTMUTEX;
870	}
871
872	/* Finally the instance data itself. */
873	RTMemFree(pGMM);
874	LogFlow(("GMMTerm: done\n"));
875	}
876
877
878	/**
879	* RTAvlU32Destroy callback.
880	*
881	* @returns 0
882	* @param pNode The node to destroy.
883	* @param pvGMM The GMM handle.
884	*/
885	static DECLCALLBACK(int) gmmR0TermDestroyChunk(PAVLU32NODECORE pNode, void *pvGMM)
886	{
887	PGMMCHUNK pChunk = (PGMMCHUNK)pNode;
888
889	if (pChunk->cFree != (GMM_CHUNK_SIZE >> PAGE_SHIFT))
890	SUPR0Printf("GMMR0Term: %RKv/%#x: cFree=%d cPrivate=%d cShared=%d cMappings=%d\n", pChunk,
891	pChunk->Core.Key, pChunk->cFree, pChunk->cPrivate, pChunk->cShared, pChunk->cMappingsX);
892
893	int rc = RTR0MemObjFree(pChunk->hMemObj, true /* fFreeMappings */);
894	if (RT_FAILURE(rc))
895	{
896	SUPR0Printf("GMMR0Term: %RKv/%#x: RTR0MemObjFree(%RKv,true) -> %d (cMappings=%d)\n", pChunk,
897	pChunk->Core.Key, pChunk->hMemObj, rc, pChunk->cMappingsX);
898	AssertRC(rc);
899	}
900	pChunk->hMemObj = NIL_RTR0MEMOBJ;
901
902	RTMemFree(pChunk->paMappingsX);
903	pChunk->paMappingsX = NULL;
904
905	RTMemFree(pChunk);
906	NOREF(pvGMM);
907	return 0;
908	}
909
910
911	/**
912	* Initializes the per-VM data for the GMM.
913	*
914	* This is called from within the GVMM lock (from GVMMR0CreateVM)
915	* and should only initialize the data members so GMMR0CleanupVM
916	* can deal with them. We reserve no memory or anything here,
917	* that's done later in GMMR0InitVM.
918	*
919	* @param pGVM Pointer to the Global VM structure.
920	*/
921	GMMR0DECL(int) GMMR0InitPerVMData(PGVM pGVM)
922	{
923	AssertCompile(RT_SIZEOFMEMB(GVM,gmm.s) <= RT_SIZEOFMEMB(GVM,gmm.padding));
924
925	pGVM->gmm.s.Stats.enmPolicy = GMMOCPOLICY_INVALID;
926	pGVM->gmm.s.Stats.enmPriority = GMMPRIORITY_INVALID;
927	pGVM->gmm.s.Stats.fMayAllocate = false;
928
929	pGVM->gmm.s.hChunkTlbSpinLock = NIL_RTSPINLOCK;
930	int rc = RTSpinlockCreate(&pGVM->gmm.s.hChunkTlbSpinLock, RTSPINLOCK_FLAGS_INTERRUPT_SAFE, "per-vm-chunk-tlb");
931	AssertRCReturn(rc, rc);
932
933	return VINF_SUCCESS;
934	}
935
936
937	/**
938	* Acquires the GMM giant lock.
939	*
940	* @returns Assert status code from RTSemFastMutexRequest.
941	* @param pGMM Pointer to the GMM instance.
942	*/
943	static int gmmR0MutexAcquire(PGMM pGMM)
944	{
945	ASMAtomicIncU32(&pGMM->cMtxContenders);
946	#ifdef VBOX_USE_CRIT_SECT_FOR_GIANT
947	int rc = RTCritSectEnter(&pGMM->GiantCritSect);
948	#else
949	int rc = RTSemFastMutexRequest(pGMM->hMtx);
950	#endif
951	ASMAtomicDecU32(&pGMM->cMtxContenders);
952	AssertRC(rc);
953	#ifdef VBOX_STRICT
954	pGMM->hMtxOwner = RTThreadNativeSelf();
955	#endif
956	return rc;
957	}
958
959
960	/**
961	* Releases the GMM giant lock.
962	*
963	* @returns Assert status code from RTSemFastMutexRelease.
964	* @param pGMM Pointer to the GMM instance.
965	*/
966	static int gmmR0MutexRelease(PGMM pGMM)
967	{
968	#ifdef VBOX_STRICT
969	pGMM->hMtxOwner = NIL_RTNATIVETHREAD;
970	#endif
971	#ifdef VBOX_USE_CRIT_SECT_FOR_GIANT
972	int rc = RTCritSectLeave(&pGMM->GiantCritSect);
973	#else
974	int rc = RTSemFastMutexRelease(pGMM->hMtx);
975	AssertRC(rc);
976	#endif
977	return rc;
978	}
979
980
981	/**
982	* Yields the GMM giant lock if there is contention and a certain minimum time
983	* has elapsed since we took it.
984	*
985	* @returns @c true if the mutex was yielded, @c false if not.
986	* @param pGMM Pointer to the GMM instance.
987	* @param puLockNanoTS Where the lock acquisition time stamp is kept
988	* (in/out).
989	*/
990	static bool gmmR0MutexYield(PGMM pGMM, uint64_t *puLockNanoTS)
991	{
992	/*
993	* If nobody is contending the mutex, don't bother checking the time.
994	*/
995	if (ASMAtomicReadU32(&pGMM->cMtxContenders) == 0)
996	return false;
997
998	/*
999	* Don't yield if we haven't executed for at least 2 milliseconds.
1000	*/
1001	uint64_t uNanoNow = RTTimeSystemNanoTS();
1002	if (uNanoNow - *puLockNanoTS < UINT32_C(2000000))
1003	return false;
1004
1005	/*
1006	* Yield the mutex.
1007	*/
1008	#ifdef VBOX_STRICT
1009	pGMM->hMtxOwner = NIL_RTNATIVETHREAD;
1010	#endif
1011	ASMAtomicIncU32(&pGMM->cMtxContenders);
1012	#ifdef VBOX_USE_CRIT_SECT_FOR_GIANT
1013	int rc1 = RTCritSectLeave(&pGMM->GiantCritSect); AssertRC(rc1);
1014	#else
1015	int rc1 = RTSemFastMutexRelease(pGMM->hMtx); AssertRC(rc1);
1016	#endif
1017
1018	RTThreadYield();
1019
1020	#ifdef VBOX_USE_CRIT_SECT_FOR_GIANT
1021	int rc2 = RTCritSectEnter(&pGMM->GiantCritSect); AssertRC(rc2);
1022	#else
1023	int rc2 = RTSemFastMutexRequest(pGMM->hMtx); AssertRC(rc2);
1024	#endif
1025	*puLockNanoTS = RTTimeSystemNanoTS();
1026	ASMAtomicDecU32(&pGMM->cMtxContenders);
1027	#ifdef VBOX_STRICT
1028	pGMM->hMtxOwner = RTThreadNativeSelf();
1029	#endif
1030
1031	return true;
1032	}
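
/*
 * Illustration only, not part of GMMR0.cpp: a minimal sketch of the yield
 * policy used by gmmR0MutexYield above, written against the C++ standard
 * library instead of IPRT. GiantLock, shouldYield and yieldIfContended are
 * hypothetical names; only the two conditions (somebody is waiting, and the
 * lock has been held for at least 2 ms) are taken from the function above.
 */

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstdint>
#include <mutex>
#include <thread>

using Clock = std::chrono::steady_clock;

// Hypothetical stand-in for the giant lock plus its contender counter.
struct GiantLock
{
    std::mutex            mtx;
    std::atomic<uint32_t> cContenders{0};
};

// Pure decision: yield only if someone is waiting and we have held the lock
// for at least the minimum hold time.
inline bool shouldYield(uint32_t cContenders, std::chrono::nanoseconds heldFor,
                        std::chrono::nanoseconds minHold = std::chrono::milliseconds(2))
{
    return cContenders != 0 && heldFor >= minHold;
}

// The caller must own lock.mtx. Returns true if the lock was dropped and
// retaken, in which case tsAcquired is refreshed.
inline bool yieldIfContended(GiantLock &lock, Clock::time_point &tsAcquired)
{
    auto const heldFor = std::chrono::duration_cast<std::chrono::nanoseconds>(Clock::now() - tsAcquired);
    if (!shouldYield(lock.cContenders.load(), heldFor))
        return false;

    lock.cContenders.fetch_add(1);  // we are about to wait for the lock ourselves
    lock.mtx.unlock();
    std::this_thread::yield();
    lock.mtx.lock();
    tsAcquired = Clock::now();
    lock.cContenders.fetch_sub(1);
    return true;
}
```

/*
 * Checking the contender count first keeps the common, uncontended case free
 * of any clock reads.
 */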
1033
1034
1035	/**
1036	* Acquires a chunk lock.
1037	*
1038	* The caller must own the giant lock.
1039	*
1040	* @returns Assert status code from RTSemFastMutexRequest.
1041	* @param pMtxState The chunk mutex state info. (Avoids
1042	* passing the same flags and stuff around
1043	* for subsequent release and drop-giant
1044	* calls.)
1045	* @param pGMM Pointer to the GMM instance.
1046	* @param pChunk Pointer to the chunk.
1047	* @param fFlags Flags regarding the giant lock, GMMR0CHUNK_MTX_XXX.
1048	*/
1049	static int gmmR0ChunkMutexAcquire(PGMMR0CHUNKMTXSTATE pMtxState, PGMM pGMM, PGMMCHUNK pChunk, uint32_t fFlags)
1050	{
1051	Assert(fFlags > GMMR0CHUNK_MTX_INVALID && fFlags < GMMR0CHUNK_MTX_END);
1052	Assert(pGMM->hMtxOwner == RTThreadNativeSelf());
1053
1054	pMtxState->pGMM = pGMM;
1055	pMtxState->fFlags = (uint8_t)fFlags;
1056
1057	/*
1058	* Get the lock index and reference the lock.
1059	*/
1060	Assert(pGMM->hMtxOwner == RTThreadNativeSelf());
1061	uint32_t iChunkMtx = pChunk->iChunkMtx;
1062	if (iChunkMtx == UINT8_MAX)
1063	{
1064	iChunkMtx = pGMM->iNextChunkMtx++;
1065	iChunkMtx %= RT_ELEMENTS(pGMM->aChunkMtx);
1066
1067	/* Try to get an unused one... */
1068	if (pGMM->aChunkMtx[iChunkMtx].cUsers)
1069	{
1070	iChunkMtx = pGMM->iNextChunkMtx++;
1071	iChunkMtx %= RT_ELEMENTS(pGMM->aChunkMtx);
1072	if (pGMM->aChunkMtx[iChunkMtx].cUsers)
1073	{
1074	iChunkMtx = pGMM->iNextChunkMtx++;
1075	iChunkMtx %= RT_ELEMENTS(pGMM->aChunkMtx);
1076	if (pGMM->aChunkMtx[iChunkMtx].cUsers)
1077	{
1078	iChunkMtx = pGMM->iNextChunkMtx++;
1079	iChunkMtx %= RT_ELEMENTS(pGMM->aChunkMtx);
1080	}
1081	}
1082	}
1083
1084	pChunk->iChunkMtx = iChunkMtx;
1085	}
1086	AssertCompile(RT_ELEMENTS(pGMM->aChunkMtx) < UINT8_MAX);
1087	pMtxState->iChunkMtx = (uint8_t)iChunkMtx;
1088	ASMAtomicIncU32(&pGMM->aChunkMtx[iChunkMtx].cUsers);
1089
1090	/*
1091	* Drop the giant?
1092	*/
1093	if (fFlags != GMMR0CHUNK_MTX_KEEP_GIANT)
1094	{
1095	/** @todo GMM life cycle cleanup (we may race someone
1096	* destroying and cleaning up GMM)? */
1097	gmmR0MutexRelease(pGMM);
1098	}
1099
1100	/*
1101	* Take the chunk mutex.
1102	*/
1103	int rc = RTSemFastMutexRequest(pGMM->aChunkMtx[iChunkMtx].hMtx);
1104	AssertRC(rc);
1105	return rc;
1106	}
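
/*
 * Illustration only, not part of GMMR0.cpp: a minimal sketch of how
 * gmmR0ChunkMutexAcquire above picks a slot in a fixed mutex pool. It takes
 * the next round-robin index and, while that slot already has users, retries
 * up to three more times before settling for the last probe. MutexPool,
 * pickSlot and acquireSlot are hypothetical names, and the actual mutex
 * handling and giant lock interaction are left out.
 */

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr uint8_t kNoMutex = UINT8_MAX;  // "no mutex assigned yet" marker

// Hypothetical, simplified pool: only the user counts matter for slot selection.
template <std::size_t N>
struct MutexPool
{
    static_assert(N < UINT8_MAX, "slot index must fit in a uint8_t");

    std::array<uint32_t, N> cUsers{};
    uint32_t                iNext = 0;

    // Round-robin pick; if the slot is busy, retry up to three more times and
    // then accept whatever the last probe returned, busy or not.
    uint8_t pickSlot()
    {
        uint32_t i = iNext++ % N;
        for (int cRetries = 0; cRetries < 3 && cUsers[i] != 0; cRetries++)
            i = iNext++ % N;
        return static_cast<uint8_t>(i);
    }

    // Returns the slot for an object, assigning one on first use, and takes a
    // user reference on it.
    uint8_t acquireSlot(uint8_t &iAssigned)
    {
        if (iAssigned == kNoMutex)
            iAssigned = pickSlot();
        cUsers[iAssigned]++;
        return iAssigned;
    }
};
```

/*
 * Bounding the number of probes keeps the selection cheap; sharing a slot
 * with another chunk only costs some extra contention.
 */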
1107
1108
1109	/**
1110	* Releases a chunk mutex acquired by gmmR0ChunkMutexAcquire.
1111	*
1112	* @returns Assert status code from RTSemFastMutexRelease or gmmR0MutexAcquire.
1113	* @param pMtxState Pointer to the chunk mutex state.
1114	* @param pChunk Pointer to the chunk if it's still
1115	* alive, NULL if it isn't. This is used to disassociate
1116	* the chunk from the mutex on the way out so a new one
1117	* can be selected next time, thus avoiding contended
1118	* mutexes.
1119	*/
1120	static int gmmR0ChunkMutexRelease(PGMMR0CHUNKMTXSTATE pMtxState, PGMMCHUNK pChunk)
1121	{
1122	PGMM pGMM = pMtxState->pGMM;
1123
1124	/*
1125	* Release the chunk mutex and reacquire the giant if requested.
1126	*/
1127	int rc = RTSemFastMutexRelease(pGMM->aChunkMtx[pMtxState->iChunkMtx].hMtx);
1128	AssertRC(rc);
1129	if (pMtxState->fFlags == GMMR0CHUNK_MTX_RETAKE_GIANT)
1130	rc = gmmR0MutexAcquire(pGMM);
1131	else
1132	Assert((pMtxState->fFlags != GMMR0CHUNK_MTX_DROP_GIANT) == (pGMM->hMtxOwner == RTThreadNativeSelf()));
1133
1134	/*
1135	* Drop the chunk mutex user reference and deassociate it from the chunk
1136	* when possible.
1137	*/
1138	if ( ASMAtomicDecU32(&pGMM->aChunkMtx[pMtxState->iChunkMtx].cUsers) == 0
1139	&& pChunk
1140	&& RT_SUCCESS(rc) )
1141	{
1142	if (pMtxState->fFlags != GMMR0CHUNK_MTX_DROP_GIANT)
1143	pChunk->iChunkMtx = UINT8_MAX;
1144	else
1145	{
1146	rc = gmmR0MutexAcquire(pGMM);
1147	if (RT_SUCCESS(rc))
1148	{
1149	if (pGMM->aChunkMtx[pMtxState->iChunkMtx].cUsers == 0)
1150	pChunk->iChunkMtx = UINT8_MAX;
1151	rc = gmmR0MutexRelease(pGMM);
1152	}
1153	}
1154	}
1155
1156	pMtxState->pGMM = NULL;
1157	return rc;
1158	}
1159
1160
1161	/**
1162	* Drops the giant GMM lock we kept in gmmR0ChunkMutexAcquire while keeping the
1163	* chunk locked.
1164	*
1165	* This only works if gmmR0ChunkMutexAcquire was called with
1166	* GMMR0CHUNK_MTX_KEEP_GIANT. gmmR0ChunkMutexRelease will retake the giant
1167	* mutex, i.e. behave as if GMMR0CHUNK_MTX_RETAKE_GIANT was used.
1168	*
1169	* @returns VBox status code (assuming success is ok).
1170	* @param pMtxState Pointer to the chunk mutex state.
1171	*/
1172	static int gmmR0ChunkMutexDropGiant(PGMMR0CHUNKMTXSTATE pMtxState)
1173	{
1174	AssertReturn(pMtxState->fFlags == GMMR0CHUNK_MTX_KEEP_GIANT, VERR_GMM_MTX_FLAGS);
1175	Assert(pMtxState->pGMM->hMtxOwner == RTThreadNativeSelf());
1176	pMtxState->fFlags = GMMR0CHUNK_MTX_RETAKE_GIANT;
1177	/** @todo GMM life cycle cleanup (we may race someone
1178	* destroying and cleaning up GMM)? */
1179	return gmmR0MutexRelease(pMtxState->pGMM);
1180	}
1181
1182
1183	/**
1184	* For experimenting with NUMA affinity and such.
1185	*
1186	* @returns The current NUMA Node ID.
1187	*/
1188	static uint16_t gmmR0GetCurrentNumaNodeId(void)
1189	{
1190	#if 1
1191	return GMM_CHUNK_NUMA_ID_UNKNOWN;
1192	#else
1193	return RTMpCpuId() / 16;
1194	#endif
1195	}
1196
1197
1198
1199	/**
1200	* Cleans up when a VM is terminating.
1201	*
1202	* @param pGVM Pointer to the Global VM structure.
1203	*/
1204	GMMR0DECL(void) GMMR0CleanupVM(PGVM pGVM)
1205	{
1206	LogFlow(("GMMR0CleanupVM: pGVM=%p:{.hSelf=%#x}\n", pGVM, pGVM->hSelf));
1207
1208	PGMM pGMM;
1209	GMM_GET_VALID_INSTANCE_VOID(pGMM);
1210
1211	#ifdef VBOX_WITH_PAGE_SHARING
1212	/*
1213	* Clean up all registered shared modules first.
1214	*/
1215	gmmR0SharedModuleCleanup(pGMM, pGVM);
1216	#endif
1217
1218	gmmR0MutexAcquire(pGMM);
1219	uint64_t uLockNanoTS = RTTimeSystemNanoTS();
1220	GMM_CHECK_SANITY_UPON_ENTERING(pGMM);
1221
1222	/*
1223	* The policy is 'INVALID' until the initial reservation
1224	* request has been serviced.
1225	*/
1226	if ( pGVM->gmm.s.Stats.enmPolicy > GMMOCPOLICY_INVALID
1227	&& pGVM->gmm.s.Stats.enmPolicy < GMMOCPOLICY_END)
1228	{
1229	/*
1230	* If it's the last VM around, we can skip walking all the chunks looking
1231	* for the pages owned by this VM and instead flush the whole shebang.
1232	*
1233	* This takes care of the eventuality that a VM has left shared page
1234	* references behind (shouldn't happen of course, but you never know).
1235	*/
1236	Assert(pGMM->cRegisteredVMs);
1237	pGMM->cRegisteredVMs--;
1238
1239	/*
1240	* Walk the entire pool looking for pages that belong to this VM
1241	* and leftover mappings. (This'll only catch private pages,
1242	* shared pages will be 'left behind'.)
1243	*/
1244	/** @todo r=bird: This scanning+freeing could be optimized in bound mode! */
1245	uint64_t cPrivatePages = pGVM->gmm.s.Stats.cPrivatePages; /* save */
1246
1247	unsigned iCountDown = 64;
1248	bool fRedoFromStart;
1249	PGMMCHUNK pChunk;
1250	do
1251	{
1252	fRedoFromStart = false;
1253	RTListForEachReverse(&pGMM->ChunkList, pChunk, GMMCHUNK, ListNode)
1254	{
1255	uint32_t const cFreeChunksOld = pGMM->cFreedChunks;
1256	if ( ( !pGMM->fBoundMemoryMode
1257	|| pChunk->hGVM == pGVM->hSelf)
1258	&& gmmR0CleanupVMScanChunk(pGMM, pGVM, pChunk))
1259	{
1260	/* We left the giant mutex, so reset the yield counters. */
1261	uLockNanoTS = RTTimeSystemNanoTS();
1262	iCountDown = 64;
1263	}
1264	else
1265	{
1266	/* Didn't leave it, so do normal yielding. */
1267	if (!iCountDown)
1268	gmmR0MutexYield(pGMM, &uLockNanoTS);
1269	else
1270	iCountDown--;
1271	}
1272	if (pGMM->cFreedChunks != cFreeChunksOld)
1273	{
1274	fRedoFromStart = true;
1275	break;
1276	}
1277	}
1278	} while (fRedoFromStart);
1279
1280	if (pGVM->gmm.s.Stats.cPrivatePages)
1281	SUPR0Printf("GMMR0CleanupVM: hGVM=%#x has %#x private pages that cannot be found!\n", pGVM->hSelf, pGVM->gmm.s.Stats.cPrivatePages);
1282
1283	pGMM->cAllocatedPages -= cPrivatePages;
1284
1285	/*
1286	* Free empty chunks.
1287	*/
1288	PGMMCHUNKFREESET pPrivateSet = pGMM->fBoundMemoryMode ? &pGVM->gmm.s.Private : &pGMM->PrivateX;
1289	do
1290	{
1291	fRedoFromStart = false;
1292	iCountDown = 10240;
1293	pChunk = pPrivateSet->apLists[GMM_CHUNK_FREE_SET_UNUSED_LIST];
1294	while (pChunk)
1295	{
1296	PGMMCHUNK pNext = pChunk->pFreeNext;
1297	Assert(pChunk->cFree == GMM_CHUNK_NUM_PAGES);
1298	if ( !pGMM->fBoundMemoryMode
1299	|| pChunk->hGVM == pGVM->hSelf)
1300	{
1301	uint64_t const idGenerationOld = pPrivateSet->idGeneration;
1302	if (gmmR0FreeChunk(pGMM, pGVM, pChunk, true /*fRelaxedSem*/))
1303	{
1304	/* We've left the giant mutex, restart? (+1 for our unlink) */
1305	fRedoFromStart = pPrivateSet->idGeneration != idGenerationOld + 1;
1306	if (fRedoFromStart)
1307	break;
1308	uLockNanoTS = RTTimeSystemNanoTS();
1309	iCountDown = 10240;
1310	}
1311	}
1312
1313	/* Advance and maybe yield the lock. */
1314	pChunk = pNext;
1315	if (--iCountDown == 0)
1316	{
1317	uint64_t const idGenerationOld = pPrivateSet->idGeneration;
1318	fRedoFromStart = gmmR0MutexYield(pGMM, &uLockNanoTS)
1319	&& pPrivateSet->idGeneration != idGenerationOld;
1320	if (fRedoFromStart)
1321	break;
1322	iCountDown = 10240;
1323	}
1324	}
1325	} while (fRedoFromStart);
1326
1327	/*
1328	* Account for shared pages that weren't freed.
1329	*/
1330	if (pGVM->gmm.s.Stats.cSharedPages)
1331	{
1332	Assert(pGMM->cSharedPages >= pGVM->gmm.s.Stats.cSharedPages);
1333	SUPR0Printf("GMMR0CleanupVM: hGVM=%#x left %#x shared pages behind!\n", pGVM->hSelf, pGVM->gmm.s.Stats.cSharedPages);
1334	pGMM->cLeftBehindSharedPages += pGVM->gmm.s.Stats.cSharedPages;
1335	}
1336
1337	/*
1338	* Clean up balloon statistics in case the VM process crashed.
1339	*/
1340	Assert(pGMM->cBalloonedPages >= pGVM->gmm.s.Stats.cBalloonedPages);
1341	pGMM->cBalloonedPages -= pGVM->gmm.s.Stats.cBalloonedPages;
1342
1343	/*
1344	* Update the over-commitment management statistics.
1345	*/
1346	pGMM->cReservedPages -= pGVM->gmm.s.Stats.Reserved.cBasePages
1347	+ pGVM->gmm.s.Stats.Reserved.cFixedPages
1348	+ pGVM->gmm.s.Stats.Reserved.cShadowPages;
1349	switch (pGVM->gmm.s.Stats.enmPolicy)
1350	{
1351	case GMMOCPOLICY_NO_OC:
1352	break;
1353	default:
1354	/** @todo Update GMM->cOverCommittedPages */
1355	break;
1356	}
1357	}
1358
1359	/* zap the GVM data. */
1360	pGVM->gmm.s.Stats.enmPolicy = GMMOCPOLICY_INVALID;
1361	pGVM->gmm.s.Stats.enmPriority = GMMPRIORITY_INVALID;
1362	pGVM->gmm.s.Stats.fMayAllocate = false;
1363
1364	GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
1365	gmmR0MutexRelease(pGMM);
1366
1367	/*
1368	* Destroy the spinlock.
1369	*/
1370	RTSPINLOCK hSpinlock = NIL_RTSPINLOCK;
1371	ASMAtomicXchgHandle(&pGVM->gmm.s.hChunkTlbSpinLock, NIL_RTSPINLOCK, &hSpinlock);
1372	RTSpinlockDestroy(hSpinlock);
1373
1374	LogFlow(("GMMR0CleanupVM: returns\n"));
1375	}
1376
1377
1378	/**
1379	* Scan one chunk for private pages belonging to the specified VM.
1380	*
1381	* @note This function may drop the giant mutex!
1382	*
1383	* @returns @c true if we've temporarily dropped the giant mutex, @c false if
1384	* we didn't.
1385	* @param pGMM Pointer to the GMM instance.
1386	* @param pGVM The global VM handle.
1387	* @param pChunk The chunk to scan.
1388	*/
1389	static bool gmmR0CleanupVMScanChunk(PGMM pGMM, PGVM pGVM, PGMMCHUNK pChunk)
1390	{
1391	Assert(!pGMM->fBoundMemoryMode || pChunk->hGVM == pGVM->hSelf);
1392
1393	/*
1394	* Look for pages belonging to the VM.
1395	* (Perform some internal checks while we're scanning.)
1396	*/
1397	#ifndef VBOX_STRICT
1398	if (pChunk->cFree != (GMM_CHUNK_SIZE >> PAGE_SHIFT))
1399	#endif
1400	{
1401	unsigned cPrivate = 0;
1402	unsigned cShared = 0;
1403	unsigned cFree = 0;
1404
1405	gmmR0UnlinkChunk(pChunk); /* avoiding cFreePages updates. */
1406
1407	uint16_t hGVM = pGVM->hSelf;
1408	unsigned iPage = (GMM_CHUNK_SIZE >> PAGE_SHIFT);
1409	while (iPage-- > 0)
1410	if (GMM_PAGE_IS_PRIVATE(&pChunk->aPages[iPage]))
1411	{
1412	if (pChunk->aPages[iPage].Private.hGVM == hGVM)
1413	{
1414	/*
1415	* Free the page.
1416	*
1417	* The reason for not using gmmR0FreePrivatePage here is that we
1418	* must not cause the chunk to be freed from under us - we're in
1419	* an AVL tree walk here.
1420	*/
1421	pChunk->aPages[iPage].u = 0;
1422	pChunk->aPages[iPage].Free.u2State = GMM_PAGE_STATE_FREE;
1423	pChunk->aPages[iPage].Free.fZeroed = false;
1424	pChunk->aPages[iPage].Free.iNext = pChunk->iFreeHead;
1425	pChunk->iFreeHead = iPage;
1426	pChunk->cPrivate--;
1427	pChunk->cFree++;
1428	pGVM->gmm.s.Stats.cPrivatePages--;
1429	cFree++;
1430	}
1431	else
1432	cPrivate++;
1433	}
1434	else if (GMM_PAGE_IS_FREE(&pChunk->aPages[iPage]))
1435	cFree++;
1436	else
1437	cShared++;
1438
1439	gmmR0SelectSetAndLinkChunk(pGMM, pGVM, pChunk);
1440
1441	/*
1442	* Did it add up?
1443	*/
1444	if (RT_UNLIKELY( pChunk->cFree != cFree
1445	|| pChunk->cPrivate != cPrivate
1446	|| pChunk->cShared != cShared))
1447	{
1448	SUPR0Printf("gmmR0CleanupVMScanChunk: Chunk %RKv/%#x has bogus stats - free=%d/%d private=%d/%d shared=%d/%d\n",
1449	pChunk, pChunk->Core.Key, pChunk->cFree, cFree, pChunk->cPrivate, cPrivate, pChunk->cShared, cShared);
1450	pChunk->cFree = cFree;
1451	pChunk->cPrivate = cPrivate;
1452	pChunk->cShared = cShared;
1453	}
1454	}
1455
1456	/*
1457	* If not in bound memory mode, we should reset the hGVM field
1458	* if it has our handle in it.
1459	*/
1460	if (pChunk->hGVM == pGVM->hSelf)
1461	{
1462	if (!g_pGMM->fBoundMemoryMode)
1463	pChunk->hGVM = NIL_GVM_HANDLE;
1464	else if (pChunk->cFree != GMM_CHUNK_NUM_PAGES)
1465	{
1466	SUPR0Printf("gmmR0CleanupVMScanChunk: %RKv/%#x: cFree=%#x - it should be 0 in bound mode!\n",
1467	pChunk, pChunk->Core.Key, pChunk->cFree);
1468	AssertMsgFailed(("%p/%#x: cFree=%#x - it should be 0 in bound mode!\n", pChunk, pChunk->Core.Key, pChunk->cFree));
1469
1470	gmmR0UnlinkChunk(pChunk);
1471	pChunk->cFree = GMM_CHUNK_NUM_PAGES;
1472	gmmR0SelectSetAndLinkChunk(pGMM, pGVM, pChunk);
1473	}
1474	}
1475
1476	/*
1477	* Look for a mapping belonging to the terminating VM.
1478	*/
1479	GMMR0CHUNKMTXSTATE MtxState;
1480	gmmR0ChunkMutexAcquire(&MtxState, pGMM, pChunk, GMMR0CHUNK_MTX_KEEP_GIANT);
1481	unsigned cMappings = pChunk->cMappingsX;
1482	for (unsigned i = 0; i < cMappings; i++)
1483	if (pChunk->paMappingsX[i].pGVM == pGVM)
1484	{
1485	gmmR0ChunkMutexDropGiant(&MtxState);
1486
1487	RTR0MEMOBJ hMemObj = pChunk->paMappingsX[i].hMapObj;
1488
1489	cMappings--;
1490	if (i < cMappings)
1491	pChunk->paMappingsX[i] = pChunk->paMappingsX[cMappings];
1492	pChunk->paMappingsX[cMappings].pGVM = NULL;
1493	pChunk->paMappingsX[cMappings].hMapObj = NIL_RTR0MEMOBJ;
1494	Assert(pChunk->cMappingsX - 1U == cMappings);
1495	pChunk->cMappingsX = cMappings;
1496
1497	int rc = RTR0MemObjFree(hMemObj, false /* fFreeMappings (NA) */);
1498	if (RT_FAILURE(rc))
1499	{
1500	SUPR0Printf("gmmR0CleanupVMScanChunk: %RKv/%#x: mapping #%x: RTR0MemObjFree(%RKv,false) -> %d \n",
1501	pChunk, pChunk->Core.Key, i, hMemObj, rc);
1502	AssertRC(rc);
1503	}
1504
1505	gmmR0ChunkMutexRelease(&MtxState, pChunk);
1506	return true;
1507	}
1508
1509	gmmR0ChunkMutexRelease(&MtxState, pChunk);
1510	return false;
1511	}
1512
1513
1514	/**
1515	* The initial resource reservations.
1516	*
1517	* This will make memory reservations according to policy and priority. If there aren't
1518	* sufficient resources available to sustain the VM this function will fail and all
1519	* future allocation requests will fail as well.
1520	*
1521	* These are just the initial reservations made very early during the VM creation
1522	* process and will be adjusted later in the GMMR0UpdateReservation call after the
1523	* ring-3 init has completed.
1524	*
1525	* @returns VBox status code.
1526	* @retval VERR_GMM_MEMORY_RESERVATION_DECLINED
1527	* @retval VERR_GMM_
1528	*
1529	* @param pGVM The global (ring-0) VM structure.
1530	* @param idCpu The VCPU id - must be zero.
1531	* @param cBasePages The number of pages that may be allocated for the base RAM and ROMs.
1532	* This does not include MMIO2 and similar.
1533	* @param cShadowPages The number of pages that may be allocated for shadow paging structures.
1534	* @param cFixedPages The number of pages that may be allocated for fixed objects like the
1535	* hyper heap, MMIO2 and similar.
1536	* @param enmPolicy The OC policy to use on this VM.
1537	* @param enmPriority The priority in an out-of-memory situation.
1538	*
1539	* @thread The creator thread / EMT(0).
1540	*/
1541	GMMR0DECL(int) GMMR0InitialReservation(PGVM pGVM, VMCPUID idCpu, uint64_t cBasePages, uint32_t cShadowPages,
1542	uint32_t cFixedPages, GMMOCPOLICY enmPolicy, GMMPRIORITY enmPriority)
1543	{
1544	LogFlow(("GMMR0InitialReservation: pGVM=%p cBasePages=%#llx cShadowPages=%#x cFixedPages=%#x enmPolicy=%d enmPriority=%d\n",
1545	pGVM, cBasePages, cShadowPages, cFixedPages, enmPolicy, enmPriority));
1546
1547	/*
1548	* Validate, get basics and take the semaphore.
1549	*/
1550	AssertReturn(idCpu == 0, VERR_INVALID_CPU_ID);
1551	PGMM pGMM;
1552	GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
1553	int rc = GVMMR0ValidateGVMandEMT(pGVM, idCpu);
1554	if (RT_FAILURE(rc))
1555	return rc;
1556
1557	AssertReturn(cBasePages, VERR_INVALID_PARAMETER);
1558	AssertReturn(cShadowPages, VERR_INVALID_PARAMETER);
1559	AssertReturn(cFixedPages, VERR_INVALID_PARAMETER);
1560	AssertReturn(enmPolicy > GMMOCPOLICY_INVALID && enmPolicy < GMMOCPOLICY_END, VERR_INVALID_PARAMETER);
1561	AssertReturn(enmPriority > GMMPRIORITY_INVALID && enmPriority < GMMPRIORITY_END, VERR_INVALID_PARAMETER);
1562
1563	gmmR0MutexAcquire(pGMM);
1564	if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
1565	{
1566	if ( !pGVM->gmm.s.Stats.Reserved.cBasePages
1567	&& !pGVM->gmm.s.Stats.Reserved.cFixedPages
1568	&& !pGVM->gmm.s.Stats.Reserved.cShadowPages)
1569	{
1570	/*
1571	* Check if we can accommodate this.
1572	*/
1573	/* ... later ... */
1574	if (RT_SUCCESS(rc))
1575	{
1576	/*
1577	* Update the records.
1578	*/
1579	pGVM->gmm.s.Stats.Reserved.cBasePages = cBasePages;
1580	pGVM->gmm.s.Stats.Reserved.cFixedPages = cFixedPages;
1581	pGVM->gmm.s.Stats.Reserved.cShadowPages = cShadowPages;
1582	pGVM->gmm.s.Stats.enmPolicy = enmPolicy;
1583	pGVM->gmm.s.Stats.enmPriority = enmPriority;
1584	pGVM->gmm.s.Stats.fMayAllocate = true;
1585
1586	pGMM->cReservedPages += cBasePages + cFixedPages + cShadowPages;
1587	pGMM->cRegisteredVMs++;
1588	}
1589	}
1590	else
1591	rc = VERR_WRONG_ORDER;
1592	GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
1593	}
1594	else
1595	rc = VERR_GMM_IS_NOT_SANE;
1596	gmmR0MutexRelease(pGMM);
1597	LogFlow(("GMMR0InitialReservation: returns %Rrc\n", rc));
1598	return rc;
1599	}
1600
1601
1602	/**
1603	* VMMR0 request wrapper for GMMR0InitialReservation.
1604	*
1605	* @returns see GMMR0InitialReservation.
1606	* @param pGVM The global (ring-0) VM structure.
1607	* @param idCpu The VCPU id.
1608	* @param pReq Pointer to the request packet.
1609	*/
1610	GMMR0DECL(int) GMMR0InitialReservationReq(PGVM pGVM, VMCPUID idCpu, PGMMINITIALRESERVATIONREQ pReq)
1611	{
1612	/*
1613	* Validate input and pass it on.
1614	*/
1615	AssertPtrReturn(pGVM, VERR_INVALID_POINTER);
1616	AssertPtrReturn(pReq, VERR_INVALID_POINTER);
1617	AssertMsgReturn(pReq->Hdr.cbReq == sizeof(*pReq), ("%#x != %#x\n", pReq->Hdr.cbReq, sizeof(*pReq)), VERR_INVALID_PARAMETER);
1618
1619	return GMMR0InitialReservation(pGVM, idCpu, pReq->cBasePages, pReq->cShadowPages,
1620	pReq->cFixedPages, pReq->enmPolicy, pReq->enmPriority);
1621	}
1622
1623
1624	/**
1625	* This updates the memory reservation with the additional MMIO2 and ROM pages.
1626	*
1627	* @returns VBox status code.
1628	* @retval VERR_GMM_MEMORY_RESERVATION_DECLINED
1629	*
1630	* @param pGVM The global (ring-0) VM structure.
1631	* @param idCpu The VCPU id.
1632	* @param cBasePages The number of pages that may be allocated for the base RAM and ROMs.
1633	* This does not include MMIO2 and similar.
1634	* @param cShadowPages The number of pages that may be allocated for shadow paging structures.
1635	* @param cFixedPages The number of pages that may be allocated for fixed objects like the
1636	* hyper heap, MMIO2 and similar.
1637	*
1638	* @thread EMT(idCpu)
1639	*/
1640	GMMR0DECL(int) GMMR0UpdateReservation(PGVM pGVM, VMCPUID idCpu, uint64_t cBasePages,
1641	uint32_t cShadowPages, uint32_t cFixedPages)
1642	{
1643	LogFlow(("GMMR0UpdateReservation: pGVM=%p cBasePages=%#llx cShadowPages=%#x cFixedPages=%#x\n",
1644	pGVM, cBasePages, cShadowPages, cFixedPages));
1645
1646	/*
1647	* Validate, get basics and take the semaphore.
1648	*/
1649	PGMM pGMM;
1650	GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
1651	int rc = GVMMR0ValidateGVMandEMT(pGVM, idCpu);
1652	if (RT_FAILURE(rc))
1653	return rc;
1654
1655	AssertReturn(cBasePages, VERR_INVALID_PARAMETER);
1656	AssertReturn(cShadowPages, VERR_INVALID_PARAMETER);
1657	AssertReturn(cFixedPages, VERR_INVALID_PARAMETER);
1658
1659	gmmR0MutexAcquire(pGMM);
1660	if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
1661	{
1662	if ( pGVM->gmm.s.Stats.Reserved.cBasePages
1663	&& pGVM->gmm.s.Stats.Reserved.cFixedPages
1664	&& pGVM->gmm.s.Stats.Reserved.cShadowPages)
1665	{
1666	/*
1667	* Check if we can accommodate this.
1668	*/
1669	/* ... later ... */
1670	if (RT_SUCCESS(rc))
1671	{
1672	/*
1673	* Update the records.
1674	*/
1675	pGMM->cReservedPages -= pGVM->gmm.s.Stats.Reserved.cBasePages
1676	+ pGVM->gmm.s.Stats.Reserved.cFixedPages
1677	+ pGVM->gmm.s.Stats.Reserved.cShadowPages;
1678	pGMM->cReservedPages += cBasePages + cFixedPages + cShadowPages;
1679
1680	pGVM->gmm.s.Stats.Reserved.cBasePages = cBasePages;
1681	pGVM->gmm.s.Stats.Reserved.cFixedPages = cFixedPages;
1682	pGVM->gmm.s.Stats.Reserved.cShadowPages = cShadowPages;
1683	}
1684	}
1685	else
1686	rc = VERR_WRONG_ORDER;
1687	GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
1688	}
1689	else
1690	rc = VERR_GMM_IS_NOT_SANE;
1691	gmmR0MutexRelease(pGMM);
1692	LogFlow(("GMMR0UpdateReservation: returns %Rrc\n", rc));
1693	return rc;
1694	}
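
/*
 * Illustration only, not part of GMMR0.cpp: a minimal sketch of the
 * bookkeeping done by GMMR0InitialReservation and GMMR0UpdateReservation
 * above. The initial call requires that nothing has been reserved yet, the
 * update requires a complete earlier reservation, and both keep the global
 * total in step with the per-VM numbers. All type and function names are
 * hypothetical; locking, sanity checks and policy handling are left out.
 */

```cpp
#include <cassert>
#include <cstdint>

enum class Status { Ok, InvalidParameter, WrongOrder };

struct Reservation
{
    uint64_t cBasePages   = 0;
    uint32_t cFixedPages  = 0;
    uint32_t cShadowPages = 0;

    uint64_t total() const      { return cBasePages + cFixedPages + cShadowPages; }
    bool     allNonZero() const { return cBasePages && cFixedPages && cShadowPages; }
    bool     allZero() const    { return !cBasePages && !cFixedPages && !cShadowPages; }
};

struct GlobalStats
{
    uint64_t cReservedPages = 0;
    uint32_t cRegisteredVMs = 0;
};

inline Status initialReservation(GlobalStats &g, Reservation &vm, const Reservation &req)
{
    if (!req.allNonZero())
        return Status::InvalidParameter;
    if (!vm.allZero())
        return Status::WrongOrder;          // already reserved once
    vm = req;
    g.cReservedPages += req.total();
    g.cRegisteredVMs++;
    return Status::Ok;
}

inline Status updateReservation(GlobalStats &g, Reservation &vm, const Reservation &req)
{
    if (!req.allNonZero())
        return Status::InvalidParameter;
    if (!vm.allNonZero())
        return Status::WrongOrder;          // no initial reservation yet
    g.cReservedPages -= vm.total();         // drop the old contribution
    g.cReservedPages += req.total();        // add the new one
    vm = req;
    return Status::Ok;
}
```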
1695
1696
1697	/**
1698	* VMMR0 request wrapper for GMMR0UpdateReservation.
1699	*
1700	* @returns see GMMR0UpdateReservation.
1701	* @param pGVM The global (ring-0) VM structure.
1702	* @param idCpu The VCPU id.
1703	* @param pReq Pointer to the request packet.
1704	*/
1705	GMMR0DECL(int) GMMR0UpdateReservationReq(PGVM pGVM, VMCPUID idCpu, PGMMUPDATERESERVATIONREQ pReq)
1706	{
1707	/*
1708	* Validate input and pass it on.
1709	*/
1710	AssertPtrReturn(pReq, VERR_INVALID_POINTER);
1711	AssertMsgReturn(pReq->Hdr.cbReq == sizeof(*pReq), ("%#x != %#x\n", pReq->Hdr.cbReq, sizeof(*pReq)), VERR_INVALID_PARAMETER);
1712
1713	return GMMR0UpdateReservation(pGVM, idCpu, pReq->cBasePages, pReq->cShadowPages, pReq->cFixedPages);
1714	}
1715
1716	#ifdef GMMR0_WITH_SANITY_CHECK
1717
1718	/**
1719	* Performs sanity checks on a free set.
1720	*
1721	* @returns Error count.
1722	*
1723	* @param pGMM Pointer to the GMM instance.
1724	* @param pSet Pointer to the set.
1725	* @param pszSetName The set name.
1726	* @param pszFunction The function from which it was called.
1727	* @param uLineNo The line number.
1728	*/
1729	static uint32_t gmmR0SanityCheckSet(PGMM pGMM, PGMMCHUNKFREESET pSet, const char *pszSetName,
1730	const char *pszFunction, unsigned uLineNo)
1731	{
1732	uint32_t cErrors = 0;
1733
1734	/*
1735	* Count the free pages in all the chunks and match it against pSet->cFreePages.
1736	*/
1737	uint32_t cPages = 0;
1738	for (unsigned i = 0; i < RT_ELEMENTS(pSet->apLists); i++)
1739	{
1740	for (PGMMCHUNK pCur = pSet->apLists[i]; pCur; pCur = pCur->pFreeNext)
1741	{
1742	/** @todo check that the chunk is hashed into the right set. */
1743	cPages += pCur->cFree;
1744	}
1745	}
1746	if (RT_UNLIKELY(cPages != pSet->cFreePages))
1747	{
1748	SUPR0Printf("GMM insanity: found %#x pages in the %s set, expected %#x. (%s, line %u)\n",
1749	cPages, pszSetName, pSet->cFreePages, pszFunction, uLineNo);
1750	cErrors++;
1751	}
1752
1753	return cErrors;
1754	}
1755
1756
1757	/**
1758	* Performs some sanity checks on the GMM while owning the lock.
1759	*
1760	* @returns Error count.
1761	*
1762	* @param pGMM Pointer to the GMM instance.
1763	* @param pszFunction The function from which it is called.
1764	* @param uLineNo The line number.
1765	*/
1766	static uint32_t gmmR0SanityCheck(PGMM pGMM, const char *pszFunction, unsigned uLineNo)
1767	{
1768	uint32_t cErrors = 0;
1769
1770	cErrors += gmmR0SanityCheckSet(pGMM, &pGMM->PrivateX, "private", pszFunction, uLineNo);
1771	cErrors += gmmR0SanityCheckSet(pGMM, &pGMM->Shared, "shared", pszFunction, uLineNo);
1772	/** @todo add more sanity checks. */
1773
1774	return cErrors;
1775	}
1776
1777	#endif /* GMMR0_WITH_SANITY_CHECK */
1778
1779	/**
1780	* Looks up a chunk in the tree and fills in the TLB entry for it.
1781	*
1782	* This is not expected to fail and will bitch if it does.
1783	*
1784	* @returns Pointer to the allocation chunk, NULL if not found.
1785	* @param pGMM Pointer to the GMM instance.
1786	* @param idChunk The ID of the chunk to find.
1787	* @param pTlbe Pointer to the TLB entry.
1788	*
1789	* @note Caller owns spinlock.
1790	*/
1791	static PGMMCHUNK gmmR0GetChunkSlow(PGMM pGMM, uint32_t idChunk, PGMMCHUNKTLBE pTlbe)
1792	{
1793	PGMMCHUNK pChunk = (PGMMCHUNK)RTAvlU32Get(&pGMM->pChunks, idChunk);
1794	AssertMsgReturn(pChunk, ("Chunk %#x not found!\n", idChunk), NULL);
1795	pTlbe->idChunk = idChunk;
1796	pTlbe->pChunk = pChunk;
1797	return pChunk;
1798	}
1799
1800
1801	/**
1802	* Finds an allocation chunk, spin-locked.
1803	*
1804	* This is not expected to fail and will bitch if it does.
1805	*
1806	* @returns Pointer to the allocation chunk, NULL if not found.
1807	* @param pGMM Pointer to the GMM instance.
1808	* @param idChunk The ID of the chunk to find.
1809	*/
1810	DECLINLINE(PGMMCHUNK) gmmR0GetChunkLocked(PGMM pGMM, uint32_t idChunk)
1811	{
1812	/*
1813	* Do a TLB lookup, branch if not in the TLB.
1814	*/
1815	PGMMCHUNKTLBE pTlbe = &pGMM->ChunkTLB.aEntries[GMM_CHUNKTLB_IDX(idChunk)];
1816	PGMMCHUNK pChunk = pTlbe->pChunk;
1817	if ( pChunk == NULL
1818	|| pTlbe->idChunk != idChunk)
1819	pChunk = gmmR0GetChunkSlow(pGMM, idChunk, pTlbe);
1820	return pChunk;
1821	}
1822
1823
1824	/**
1825	* Finds an allocation chunk.
1826	*
1827	* This is not expected to fail and will bitch if it does.
1828	*
1829	* @returns Pointer to the allocation chunk, NULL if not found.
1830	* @param pGMM Pointer to the GMM instance.
1831	* @param idChunk The ID of the chunk to find.
1832	*/
1833	DECLINLINE(PGMMCHUNK) gmmR0GetChunk(PGMM pGMM, uint32_t idChunk)
1834	{
1835	RTSpinlockAcquire(pGMM->hSpinLockTree);
1836	PGMMCHUNK pChunk = gmmR0GetChunkLocked(pGMM, idChunk);
1837	RTSpinlockRelease(pGMM->hSpinLockTree);
1838	return pChunk;
1839	}
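
/*
 * Illustration only, not part of GMMR0.cpp: a minimal sketch of the lookup
 * scheme in gmmR0GetChunkLocked and gmmR0GetChunkSlow above, i.e. a small
 * direct-mapped cache consulted before the tree. std::map stands in for the
 * AVL tree, and the table size and modulo index function are assumptions
 * (the real index comes from GMM_CHUNKTLB_IDX). Locking and the invalidation
 * needed when a chunk is freed are not shown.
 */

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <map>

struct Chunk { uint32_t id; };

// Hypothetical direct-mapped cache in front of an ordered map.
class ChunkLookup
{
public:
    void insert(Chunk *pChunk) { m_tree[pChunk->id] = pChunk; }

    Chunk *get(uint32_t idChunk)
    {
        Entry &e = m_tlb[idChunk % m_tlb.size()];
        if (e.pChunk != nullptr && e.idChunk == idChunk)
            return e.pChunk;                // fast path: TLB hit

        auto it = m_tree.find(idChunk);     // slow path: tree lookup
        if (it == m_tree.end())
            return nullptr;                 // a failed lookup leaves the entry untouched
        e.idChunk = idChunk;
        e.pChunk  = it->second;
        m_cTlbFills++;
        return e.pChunk;
    }

    // Number of times the slow path found a chunk and refilled a TLB entry.
    unsigned tlbFills() const { return m_cTlbFills; }

private:
    struct Entry { uint32_t idChunk = 0; Chunk *pChunk = nullptr; };

    std::array<Entry, 8>        m_tlb{};
    std::map<uint32_t, Chunk *> m_tree;
    unsigned                    m_cTlbFills = 0;
};
```

/*
 * Two IDs that map to the same slot simply evict each other, so correctness
 * never depends on the cache; only the tree is authoritative.
 */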
1840
1841
1842	/**
1843	* Finds a page.
1844	*
1845	* This is not expected to fail and will bitch if it does.
1846	*
1847	* @returns Pointer to the page, NULL if not found.
1848	* @param pGMM Pointer to the GMM instance.
1849	* @param idPage The ID of the page to find.
1850	*/
1851	DECLINLINE(PGMMPAGE) gmmR0GetPage(PGMM pGMM, uint32_t idPage)
1852	{
1853	PGMMCHUNK pChunk = gmmR0GetChunk(pGMM, idPage >> GMM_CHUNKID_SHIFT);
1854	if (RT_LIKELY(pChunk))
1855	return &pChunk->aPages[idPage & GMM_PAGEID_IDX_MASK];
1856	return NULL;
1857	}
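
/*
 * Illustration only, not part of GMMR0.cpp: a minimal sketch of how a page
 * ID splits into a chunk ID and a page index, as gmmR0GetPage above does
 * with GMM_CHUNKID_SHIFT and GMM_PAGEID_IDX_MASK. The numeric values below
 * (4 KiB pages, 2 MiB chunks, hence a shift of 9) are assumptions chosen for
 * the example; the real constants come from the GMM headers.
 */

```cpp
#include <cassert>
#include <cstdint>

// Assumed values for illustration only.
constexpr uint32_t kPageSize      = 4096;                    // assumed 4 KiB pages
constexpr uint32_t kChunkSize     = 2 * 1024 * 1024;         // assumed 2 MiB chunks
constexpr uint32_t kPagesPerChunk = kChunkSize / kPageSize;  // 512
constexpr uint32_t kChunkIdShift  = 9;                       // log2(kPagesPerChunk)
constexpr uint32_t kPageIdxMask   = kPagesPerChunk - 1;
static_assert((1u << kChunkIdShift) == kPagesPerChunk, "shift must match pages per chunk");

// Compose a page ID from a chunk ID and a page index within that chunk.
constexpr uint32_t makePageId(uint32_t idChunk, uint32_t iPage)
{
    return (idChunk << kChunkIdShift) | (iPage & kPageIdxMask);
}

// Upper bits: which chunk the page lives in.
constexpr uint32_t chunkIdOf(uint32_t idPage)   { return idPage >> kChunkIdShift; }

// Lower bits: index into the chunk's page array.
constexpr uint32_t pageIndexOf(uint32_t idPage) { return idPage & kPageIdxMask; }
```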
1858
1859
1860	#if 0 /* unused */
1861	/**
1862	* Gets the host physical address for a page given by its ID.
1863	*
1864	* @returns The host physical address or NIL_RTHCPHYS.
1865	* @param pGMM Pointer to the GMM instance.
1866	* @param idPage The ID of the page to find.
1867	*/
1868	DECLINLINE(RTHCPHYS) gmmR0GetPageHCPhys(PGMM pGMM, uint32_t idPage)
1869	{
1870	PGMMCHUNK pChunk = gmmR0GetChunk(pGMM, idPage >> GMM_CHUNKID_SHIFT);
1871	if (RT_LIKELY(pChunk))
1872	return RTR0MemObjGetPagePhysAddr(pChunk->hMemObj, idPage & GMM_PAGEID_IDX_MASK);
1873	return NIL_RTHCPHYS;
1874	}
1875	#endif /* unused */
1876
1877
1878	/**
1879	* Selects the appropriate free list given the number of free pages.
1880	*
1881	* @returns Free list index.
1882	* @param cFree The number of free pages in the chunk.
1883	*/
1884	DECLINLINE(unsigned) gmmR0SelectFreeSetList(unsigned cFree)
1885	{
1886	unsigned iList = cFree >> GMM_CHUNK_FREE_SET_SHIFT;
1887	AssertMsg(iList < RT_SIZEOFMEMB(GMMCHUNKFREESET, apLists) / RT_SIZEOFMEMB(GMMCHUNKFREESET, apLists[0]),
1888	("%d (%u)\n", iList, cFree));
1889	return iList;
1890	}
1891
1892
1893	/**
1894	* Unlinks the chunk from the free list it's currently on (if any).
1895	*
1896	* @param pChunk The allocation chunk.
1897	*/
1898	DECLINLINE(void) gmmR0UnlinkChunk(PGMMCHUNK pChunk)
1899	{
1900	PGMMCHUNKFREESET pSet = pChunk->pSet;
1901	if (RT_LIKELY(pSet))
1902	{
1903	pSet->cFreePages -= pChunk->cFree;
1904	pSet->idGeneration++;
1905
1906	PGMMCHUNK pPrev = pChunk->pFreePrev;
1907	PGMMCHUNK pNext = pChunk->pFreeNext;
1908	if (pPrev)
1909	pPrev->pFreeNext = pNext;
1910	else
1911	pSet->apLists[gmmR0SelectFreeSetList(pChunk->cFree)] = pNext;
1912	if (pNext)
1913	pNext->pFreePrev = pPrev;
1914
1915	pChunk->pSet = NULL;
1916	pChunk->pFreeNext = NULL;
1917	pChunk->pFreePrev = NULL;
1918	}
1919	else
1920	{
1921	Assert(!pChunk->pFreeNext);
1922	Assert(!pChunk->pFreePrev);
1923	Assert(!pChunk->cFree);
1924	}
1925	}
1926
1927
1928	/**
1929	* Links the chunk onto the appropriate free list in the specified free set.
1930	*
1931	* If no free entries, it's not linked into any list.
1932	*
1933	* @param pChunk The allocation chunk.
1934	* @param pSet The free set.
1935	*/
1936	DECLINLINE(void) gmmR0LinkChunk(PGMMCHUNK pChunk, PGMMCHUNKFREESET pSet)
1937	{
1938	Assert(!pChunk->pSet);
1939	Assert(!pChunk->pFreeNext);
1940	Assert(!pChunk->pFreePrev);
1941
1942	if (pChunk->cFree > 0)
1943	{
1944	pChunk->pSet = pSet;
1945	pChunk->pFreePrev = NULL;
1946	unsigned const iList = gmmR0SelectFreeSetList(pChunk->cFree);
1947	pChunk->pFreeNext = pSet->apLists[iList];
1948	if (pChunk->pFreeNext)
1949	pChunk->pFreeNext->pFreePrev = pChunk;
1950	pSet->apLists[iList] = pChunk;
1951
1952	pSet->cFreePages += pChunk->cFree;
1953	pSet->idGeneration++;
1954	}
1955	}
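The link and unlink helpers above keep each chunk on a doubly linked list chosen by how many free pages it has. A reduced sketch of that bucketing follows; the shift of 4, the 512 pages per chunk and all type names are assumptions for illustration, and the generation counter is left out.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical simplified types; field names loosely mirror the listing.
constexpr unsigned kFreeSetShift  = 4;      // assumed: 16 free pages per bucket
constexpr unsigned kPagesPerChunk = 512;    // assumed
constexpr unsigned kLists         = (kPagesPerChunk >> kFreeSetShift) + 1;

struct FreeSet;
struct Chunk
{
    unsigned cFree = 0;
    FreeSet *pSet  = nullptr;
    Chunk   *pPrev = nullptr;
    Chunk   *pNext = nullptr;
};

struct FreeSet
{
    uint64_t cFreePages = 0;
    Chunk   *apLists[kLists] = {};
};

inline unsigned selectList(unsigned cFree) { return cFree >> kFreeSetShift; }

// Pushes the chunk onto the head of the list matching its free count.
void linkChunk(Chunk &chunk, FreeSet &set)
{
    assert(!chunk.pSet && !chunk.pPrev && !chunk.pNext);
    if (chunk.cFree == 0)
        return;                             // full chunks are not on any list
    unsigned const iList = selectList(chunk.cFree);
    chunk.pSet  = &set;
    chunk.pNext = set.apLists[iList];
    if (chunk.pNext)
        chunk.pNext->pPrev = &chunk;
    set.apLists[iList] = &chunk;
    set.cFreePages += chunk.cFree;
}

// Removes the chunk from whatever list it is on, if any.
void unlinkChunk(Chunk &chunk)
{
    FreeSet *pSet = chunk.pSet;
    if (!pSet)
        return;
    pSet->cFreePages -= chunk.cFree;
    if (chunk.pPrev)
        chunk.pPrev->pNext = chunk.pNext;
    else
        pSet->apLists[selectList(chunk.cFree)] = chunk.pNext;
    if (chunk.pNext)
        chunk.pNext->pPrev = chunk.pPrev;
    chunk.pSet  = nullptr;
    chunk.pPrev = nullptr;
    chunk.pNext = nullptr;
}
```

Because the bucket depends on the free count, a caller that changes the count would unlink first, update, then link again, which is the order gmmR0AllocatePagesFromChunk uses in the listing.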
1956
1957
1958	/**
1959	* Links the chunk onto the appropriate free list in the specified free set.
1960	*
1961	* If no free entries, it's not linked into any list.
1962	*
1963	* @param pGMM Pointer to the GMM instance.
1964	* @param pGVM Pointer to the kernel-only VM instance data.
1965	* @param pChunk The allocation chunk.
1966	*/
1967	DECLINLINE(void) gmmR0SelectSetAndLinkChunk(PGMM pGMM, PGVM pGVM, PGMMCHUNK pChunk)
1968	{
1969	PGMMCHUNKFREESET pSet;
1970	if (pGMM->fBoundMemoryMode)
1971	pSet = &pGVM->gmm.s.Private;
1972	else if (pChunk->cShared)
1973	pSet = &pGMM->Shared;
1974	else
1975	pSet = &pGMM->PrivateX;
1976	gmmR0LinkChunk(pChunk, pSet);
1977	}
1978
1979
1980	/**
1981	* Frees a Chunk ID.
1982	*
1983	* @param pGMM Pointer to the GMM instance.
1984	* @param idChunk The Chunk ID to free.
1985	*/
1986	static void gmmR0FreeChunkId(PGMM pGMM, uint32_t idChunk)
1987	{
1988	AssertReturnVoid(idChunk != NIL_GMM_CHUNKID);
1989	RTSpinlockAcquire(pGMM->hSpinLockChunkId); /* We could probably skip the locking here, I think. */
1990
1991	AssertMsg(ASMBitTest(&pGMM->bmChunkId[0], idChunk), ("%#x\n", idChunk));
1992	ASMAtomicBitClear(&pGMM->bmChunkId[0], idChunk);
1993
1994	RTSpinlockRelease(pGMM->hSpinLockChunkId);
1995	}
1996
1997
1998	/**
1999	* Allocates a new Chunk ID.
2000	*
2001	* @returns The Chunk ID.
2002	* @param pGMM Pointer to the GMM instance.
2003	*/
2004	static uint32_t gmmR0AllocateChunkId(PGMM pGMM)
2005	{
2006	AssertCompile(!((GMM_CHUNKID_LAST + 1) & 31)); /* must be a multiple of 32 */
2007	AssertCompile(NIL_GMM_CHUNKID == 0);
2008
2009	RTSpinlockAcquire(pGMM->hSpinLockChunkId);
2010
2011	/*
2012	* Try the next sequential one.
2013	*/
2014	int32_t idChunk = ++pGMM->idChunkPrev;
2015	if ( (uint32_t)idChunk <= GMM_CHUNKID_LAST
2016	&& idChunk > NIL_GMM_CHUNKID)
2017	{
2018	if (!ASMAtomicBitTestAndSet(&pGMM->bmChunkId[0], idChunk))
2019	{
2020	RTSpinlockRelease(pGMM->hSpinLockChunkId);
2021	return idChunk;
2022	}
2023
2024	/*
2025	* Scan sequentially from the last one.
2026	*/
2027	if ((uint32_t)idChunk < GMM_CHUNKID_LAST)
2028	{
2029	idChunk = ASMBitNextClear(&pGMM->bmChunkId[0], GMM_CHUNKID_LAST + 1, idChunk);
2030	if ( idChunk > NIL_GMM_CHUNKID
2031	&& (uint32_t)idChunk <= GMM_CHUNKID_LAST)
2032	{
2033	AssertMsgReturnStmt(!ASMAtomicBitTestAndSet(&pGMM->bmChunkId[0], idChunk), ("%#x\n", idChunk),
2034	RTSpinlockRelease(pGMM->hSpinLockChunkId), NIL_GMM_CHUNKID);
2035
2036	pGMM->idChunkPrev = idChunk;
2037	RTSpinlockRelease(pGMM->hSpinLockChunkId);
2038	return idChunk;
2039	}
2040	}
2041	}
2042
2043	/*
2044	* Ok, scan from the start.
2045	* We're not racing anyone, so there is no need to expect failures or have restart loops.
2046	*/
2047	idChunk = ASMBitFirstClear(&pGMM->bmChunkId[0], GMM_CHUNKID_LAST + 1);
2048	AssertMsgReturnStmt(idChunk > NIL_GMM_CHUNKID && (uint32_t)idChunk <= GMM_CHUNKID_LAST, ("%#x\n", idChunk),
2049	RTSpinlockRelease(pGMM->hSpinLockChunkId), NIL_GMM_CHUNKID);
2050	AssertMsgReturnStmt(!ASMAtomicBitTestAndSet(&pGMM->bmChunkId[0], idChunk), ("%#x\n", idChunk),
2051	RTSpinlockRelease(pGMM->hSpinLockChunkId), NIL_GMM_CHUNKID);
2052
2053	pGMM->idChunkPrev = idChunk;
2054	RTSpinlockRelease(pGMM->hSpinLockChunkId);
2055	return idChunk;
2056	}
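The ID allocator above tries three things in order: the next sequential ID, a forward scan from there, and finally a scan from the start. The sketch below reproduces that order on a 64-entry `std::bitset`; the sizes and names are hypothetical, locking is omitted, and the bookkeeping of the previous ID is simplified compared with the listing.

```cpp
#include <bitset>
#include <cassert>
#include <cstdint>

// Hypothetical miniature: 64 IDs, with 0 reserved as the NIL value.
constexpr uint32_t kNilId  = 0;
constexpr uint32_t kLastId = 63;

struct IdAllocator
{
    std::bitset<kLastId + 1> bmUsed;
    uint32_t                 idPrev = kNilId;

    IdAllocator() { bmUsed.set(kNilId); }   // the NIL ID is never handed out

    uint32_t allocate()
    {
        // 1. Try the next sequential ID.
        uint32_t id = idPrev + 1;
        if (id <= kLastId && !bmUsed.test(id))
            return take(id);

        // 2. Scan forward from the last one.
        for (; id <= kLastId; id++)
            if (!bmUsed.test(id))
                return take(id);

        // 3. Wrap around and scan from the start.
        for (id = kNilId + 1; id <= kLastId; id++)
            if (!bmUsed.test(id))
                return take(id);

        return kNilId;                      // every ID is in use
    }

    void release(uint32_t id)
    {
        assert(id != kNilId && bmUsed.test(id));
        bmUsed.reset(id);
    }

private:
    uint32_t take(uint32_t id)
    {
        bmUsed.set(id);
        idPrev = id;
        return id;
    }
};
```

Preferring the next sequential ID means a freed ID is not reused until the allocator wraps around, which may make stale IDs easier to spot.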
2057
2058
2059	/**
2060	* Allocates one private page.
2061	*
2062	* Worker for gmmR0AllocatePages.
2063	*
2064	* @param pChunk The chunk to allocate it from.
2065	* @param hGVM The GVM handle of the VM requesting memory.
2066	* @param pPageDesc The page descriptor.
2067	*/
2068	static void gmmR0AllocatePage(PGMMCHUNK pChunk, uint32_t hGVM, PGMMPAGEDESC pPageDesc)
2069	{
2070	/* update the chunk stats. */
2071	if (pChunk->hGVM == NIL_GVM_HANDLE)
2072	pChunk->hGVM = hGVM;
2073	Assert(pChunk->cFree);
2074	pChunk->cFree--;
2075	pChunk->cPrivate++;
2076
2077	/* unlink the first free page. */
2078	const uint32_t iPage = pChunk->iFreeHead;
2079	AssertReleaseMsg(iPage < RT_ELEMENTS(pChunk->aPages), ("%d\n", iPage));
2080	PGMMPAGE pPage = &pChunk->aPages[iPage];
2081	Assert(GMM_PAGE_IS_FREE(pPage));
2082	pChunk->iFreeHead = pPage->Free.iNext;
2083	Log3(("A pPage=%p iPage=%#x/%#x u2State=%d iFreeHead=%#x iNext=%#x\n",
2084	pPage, iPage, (pChunk->Core.Key << GMM_CHUNKID_SHIFT) | iPage,
2085	pPage->Common.u2State, pChunk->iFreeHead, pPage->Free.iNext));
2086
2087	bool const fZeroed = pPage->Free.fZeroed;
2088
2089	/* make the page private. */
2090	pPage->u = 0;
2091	AssertCompile(GMM_PAGE_STATE_PRIVATE == 0);
2092	pPage->Private.hGVM = hGVM;
2093	AssertCompile(NIL_RTHCPHYS >= GMM_GCPHYS_LAST);
2094	AssertCompile(GMM_GCPHYS_UNSHAREABLE >= GMM_GCPHYS_LAST);
2095	if (pPageDesc->HCPhysGCPhys <= GMM_GCPHYS_LAST)
2096	pPage->Private.pfn = pPageDesc->HCPhysGCPhys >> PAGE_SHIFT;
2097	else
2098	pPage->Private.pfn = GMM_PAGE_PFN_UNSHAREABLE; /* unshareable / unassigned - same thing. */
2099
2100	/* update the page descriptor. */
2101	pPageDesc->idSharedPage = NIL_GMM_PAGEID;
2102	pPageDesc->idPage = (pChunk->Core.Key << GMM_CHUNKID_SHIFT) | iPage;
2103	RTHCPHYS const HCPhys = RTR0MemObjGetPagePhysAddr(pChunk->hMemObj, iPage);
2104	Assert(HCPhys != NIL_RTHCPHYS); Assert(HCPhys < NIL_GMMPAGEDESC_PHYS);
2105	pPageDesc->HCPhysGCPhys = HCPhys;
2106	pPageDesc->fZeroed = fZeroed;
2107	}
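The page allocation above pops the head of a free list that is threaded through the page array by index rather than by pointer. A small sketch of that structure follows. The 8-page chunk and the type names are assumptions, and the `freePage` side is included only for symmetry; it is not taken from this part of the listing.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint16_t kNoPage = UINT16_MAX;    // end-of-list marker, as in the listing
constexpr unsigned kPages  = 8;             // assumed tiny chunk for the demo

struct Page
{
    bool     fFree;
    uint16_t iNext;                         // index of the next free page
};

struct Chunk
{
    Page     aPages[kPages];
    uint16_t iFreeHead;
    unsigned cFree;

    // Builds the initial list 0 -> 1 -> ... -> kPages-1 -> kNoPage.
    Chunk() : iFreeHead(0), cFree(kPages)
    {
        for (unsigned i = 0; i < kPages; i++)
        {
            aPages[i].fFree = true;
            aPages[i].iNext = i + 1 < kPages ? uint16_t(i + 1) : kNoPage;
        }
    }

    // Pops the head of the free list; returns kNoPage when nothing is free.
    uint16_t allocPage()
    {
        if (cFree == 0)
            return kNoPage;
        uint16_t const iPage = iFreeHead;
        assert(iPage < kPages && aPages[iPage].fFree);
        iFreeHead = aPages[iPage].iNext;
        aPages[iPage].fFree = false;
        cFree--;
        return iPage;
    }

    // Pushes a page back onto the head of the free list.
    void freePage(uint16_t iPage)
    {
        assert(iPage < kPages && !aPages[iPage].fFree);
        aPages[iPage].fFree = true;
        aPages[iPage].iNext = iFreeHead;
        iFreeHead = iPage;
        cFree++;
    }
};
```

Storing 16-bit indices instead of pointers keeps the per-page tracking data small, in line with the footprint goal stated in the file header.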
2108
2109
2110	/**
2111	* Picks the free pages from a chunk.
2112	*
2113	* @returns The new page descriptor table index.
2114	* @param pChunk The chunk.
2115	* @param hGVM The affinity of the chunk. NIL_GVM_HANDLE for no
2116	* affinity.
2117	* @param iPage The current page descriptor table index.
2118	* @param cPages The total number of pages to allocate.
2119	* @param paPages The page descriptor table (input + output).
2120	*/
2121	static uint32_t gmmR0AllocatePagesFromChunk(PGMMCHUNK pChunk, uint16_t const hGVM, uint32_t iPage, uint32_t cPages,
2122	PGMMPAGEDESC paPages)
2123	{
2124	PGMMCHUNKFREESET pSet = pChunk->pSet; Assert(pSet);
2125	gmmR0UnlinkChunk(pChunk);
2126
2127	for (; pChunk->cFree && iPage < cPages; iPage++)
2128	gmmR0AllocatePage(pChunk, hGVM, &paPages[iPage]);
2129
2130	gmmR0LinkChunk(pChunk, pSet);
2131	return iPage;
2132	}
2133
2134
2135	/**
2136	* Registers a new chunk of memory.
2137	*
2138	* This is called by gmmR0AllocateOneChunk and GMMR0AllocateLargePage.
2139	*
2140	* In the GMMR0AllocateLargePage case the GMM_CHUNK_FLAGS_LARGE_PAGE flag is
2141	* set and the chunk will be registered as fully allocated to save time.
2142	*
2143	* @returns VBox status code. On success, the giant GMM lock will be held, the
2144	* caller must release it (ugly).
2145	* @param pGMM Pointer to the GMM instance.
2146	* @param pSet Pointer to the set.
2147	* @param hMemObj The memory object for the chunk.
2148	* @param hGVM The affinity of the chunk. NIL_GVM_HANDLE for no
2149	* affinity.
2150	* @param pSession Same as @a hGVM.
2151	* @param fChunkFlags The chunk flags, GMM_CHUNK_FLAGS_XXX.
2152	* @param cPages The number of pages requested. Zero for large pages.
2153	* @param paPages The page descriptor table (input + output). NULL for
2154	* large pages.
2155	* @param piPage The pointer to the page descriptor table index variable.
2156	* This will be updated. NULL for large pages.
2157	* @param ppChunk Chunk address (out).
2158	*
2159	* @remarks The caller must not own the giant GMM mutex.
2160	* The giant GMM mutex will be acquired and returned acquired in
2161	* the success path. On failure, no locks will be held.
2162	*/
2163	static int gmmR0RegisterChunk(PGMM pGMM, PGMMCHUNKFREESET pSet, RTR0MEMOBJ hMemObj, uint16_t hGVM, PSUPDRVSESSION pSession,
2164	uint16_t fChunkFlags, uint32_t cPages, PGMMPAGEDESC paPages, uint32_t *piPage, PGMMCHUNK *ppChunk)
2165	{
2166	/*
2167	* Validate input & state.
2168	*/
2169	Assert(pGMM->hMtxOwner != RTThreadNativeSelf());
2170	Assert(hGVM != NIL_GVM_HANDLE || pGMM->fBoundMemoryMode);
2171	Assert(fChunkFlags == 0 || fChunkFlags == GMM_CHUNK_FLAGS_LARGE_PAGE);
2172	if (!(fChunkFlags &= GMM_CHUNK_FLAGS_LARGE_PAGE))
2173	{
2174	AssertPtr(paPages);
2175	AssertPtr(piPage);
2176	Assert(cPages > 0);
2177	Assert(cPages > *piPage);
2178	}
2179	else
2180	{
2181	Assert(cPages == 0);
2182	Assert(!paPages);
2183	Assert(!piPage);
2184	}
2185
2186	#ifndef VBOX_WITH_LINEAR_HOST_PHYS_MEM
2187	/*
2188	* Get a ring-0 mapping of the object.
2189	*/
2190	uint8_t *pbMapping = (uint8_t *)RTR0MemObjAddress(hMemObj);
2191	if (!pbMapping)
2192	{
2193	RTR0MEMOBJ hMapObj;
2194	int rc = RTR0MemObjMapKernel(&hMapObj, hMemObj, (void *)-1, 0, RTMEM_PROT_READ | RTMEM_PROT_WRITE);
2195	if (RT_SUCCESS(rc))
2196	pbMapping = (uint8_t *)RTR0MemObjAddress(hMapObj);
2197	else
2198	return rc;
2199	AssertPtr(pbMapping);
2200	}
2201	#endif
2202
2203	/*
2204	* Allocate a chunk and an ID for it.
2205	*/
2206	int rc;
2207	PGMMCHUNK pChunk = (PGMMCHUNK)RTMemAllocZ(sizeof(*pChunk));
2208	if (pChunk)
2209	{
2210	pChunk->Core.Key = gmmR0AllocateChunkId(pGMM);
2211	if ( pChunk->Core.Key != NIL_GMM_CHUNKID
2212	&& pChunk->Core.Key <= GMM_CHUNKID_LAST)
2213	{
2214	/*
2215	* Initialize it.
2216	*/
2217	pChunk->hMemObj = hMemObj;
2218	#ifndef VBOX_WITH_LINEAR_HOST_PHYS_MEM
2219	pChunk->pbMapping = pbMapping;
2220	#endif
2221	pChunk->hGVM = hGVM;
2222	pChunk->idNumaNode = gmmR0GetCurrentNumaNodeId();
2223	pChunk->iChunkMtx = UINT8_MAX;
2224	pChunk->fFlags = fChunkFlags;
2225	pChunk->uidOwner = pSession ? SUPR0GetSessionUid(pSession) : NIL_RTUID;
2226	/*pChunk->cShared = 0; */
2227
2228	uint32_t const iDstPageFirst = piPage ? *piPage : cPages;
2229	if (!(fChunkFlags & GMM_CHUNK_FLAGS_LARGE_PAGE))
2230	{
2231	/*
2232	* Allocate the requested number of pages from the start of the chunk,
2233	* queue the rest (if any) on the free list.
2234	*/
2235	uint32_t const cPagesAlloc = RT_MIN(cPages - iDstPageFirst, GMM_CHUNK_NUM_PAGES);
2236	pChunk->cPrivate = cPagesAlloc;
2237	pChunk->cFree = GMM_CHUNK_NUM_PAGES - cPagesAlloc;
2238	pChunk->iFreeHead = GMM_CHUNK_NUM_PAGES > cPagesAlloc ? cPagesAlloc : UINT16_MAX;
2239
2240	/* Alloc pages: */
2241	uint32_t const idPageChunk = pChunk->Core.Key << GMM_CHUNKID_SHIFT;
2242	uint32_t iDstPage = iDstPageFirst;
2243	uint32_t iPage;
2244	for (iPage = 0; iPage < cPagesAlloc; iPage++, iDstPage++)
2245	{
2246	if (paPages[iDstPage].HCPhysGCPhys <= GMM_GCPHYS_LAST)
2247	pChunk->aPages[iPage].Private.pfn = paPages[iDstPage].HCPhysGCPhys >> PAGE_SHIFT;
2248	else
2249	pChunk->aPages[iPage].Private.pfn = GMM_PAGE_PFN_UNSHAREABLE; /* unshareable / unassigned - same thing. */
2250	pChunk->aPages[iPage].Private.hGVM = hGVM;
2251	pChunk->aPages[iPage].Private.u2State = GMM_PAGE_STATE_PRIVATE;
2252
2253	paPages[iDstPage].HCPhysGCPhys = RTR0MemObjGetPagePhysAddr(hMemObj, iPage);
2254	paPages[iDstPage].fZeroed = true;
2255	paPages[iDstPage].idPage = idPageChunk | iPage;
2256	paPages[iDstPage].idSharedPage = NIL_GMM_PAGEID;
2257	}
2258	*piPage = iDstPage;
2259
2260	/* Build free list: */
2261	if (iPage < RT_ELEMENTS(pChunk->aPages))
2262	{
2263	Assert(pChunk->iFreeHead == iPage);
2264	for (; iPage < RT_ELEMENTS(pChunk->aPages) - 1; iPage++)
2265	{
2266	pChunk->aPages[iPage].Free.u2State = GMM_PAGE_STATE_FREE;
2267	pChunk->aPages[iPage].Free.fZeroed = true;
2268	pChunk->aPages[iPage].Free.iNext = iPage + 1;
2269	}
2270	pChunk->aPages[RT_ELEMENTS(pChunk->aPages) - 1].Free.u2State = GMM_PAGE_STATE_FREE;
2271	pChunk->aPages[RT_ELEMENTS(pChunk->aPages) - 1].Free.fZeroed = true;
2272	pChunk->aPages[RT_ELEMENTS(pChunk->aPages) - 1].Free.iNext = UINT16_MAX;
2273	}
2274	else
2275	Assert(pChunk->iFreeHead == UINT16_MAX);
2276	}
2277	else
2278	{
2279	/*
2280	* Large page: Mark all pages as privately allocated (watered down gmmR0AllocatePage).
2281	*/
2282	pChunk->cFree = 0;
2283	pChunk->cPrivate = GMM_CHUNK_NUM_PAGES;
2284	pChunk->iFreeHead = UINT16_MAX;
2285
2286	for (unsigned iPage = 0; iPage < RT_ELEMENTS(pChunk->aPages); iPage++)
2287	{
2288	pChunk->aPages[iPage].Private.pfn = GMM_PAGE_PFN_UNSHAREABLE;
2289	pChunk->aPages[iPage].Private.hGVM = hGVM;
2290	pChunk->aPages[iPage].Private.u2State = GMM_PAGE_STATE_PRIVATE;
2291	}
2292	}
2293
2294	/*
2295	* Zero the memory if it wasn't zeroed by the host already.
2296	* This simplifies keeping secret kernel bits from userland and brings
2297	* everyone to the same level wrt allocation zeroing.
2298	*/
2299	rc = VINF_SUCCESS;
2300	if (!RTR0MemObjWasZeroInitialized(hMemObj))
2301	{
2302	#ifdef VBOX_WITH_LINEAR_HOST_PHYS_MEM
2303	if (!(fChunkFlags & GMM_CHUNK_FLAGS_LARGE_PAGE))
2304	{
2305	for (uint32_t iPage = 0; iPage < GMM_CHUNK_NUM_PAGES; iPage++)
2306	{
2307	void *pvPage = NULL;
2308	rc = SUPR0HCPhysToVirt(RTR0MemObjGetPagePhysAddr(hMemObj, iPage), &pvPage);
2309	AssertRCBreak(rc);
2310	RT_BZERO(pvPage, PAGE_SIZE);
2311	}
2312	}
2313	else
2314	{
2315	/* Can do the whole large page in one go. */
2316	void *pvPage = NULL;
2317	rc = SUPR0HCPhysToVirt(RTR0MemObjGetPagePhysAddr(hMemObj, 0), &pvPage);
2318	AssertRC(rc);
2319	if (RT_SUCCESS(rc))
2320	RT_BZERO(pvPage, GMM_CHUNK_SIZE);
2321	}
2322	#else
2323	RT_BZERO(pbMapping, GMM_CHUNK_SIZE);
2324	#endif
2325	}
2326	if (RT_SUCCESS(rc))
2327	{
2328	*ppChunk = pChunk;
2329
2330	/*
2331	* Allocate a Chunk ID and insert it into the tree.
2332	* This has to be done behind the mutex of course.
2333	*/
2334	rc = gmmR0MutexAcquire(pGMM);
2335	if (RT_SUCCESS(rc))
2336	{
2337	if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
2338	{
2339	RTSpinlockAcquire(pGMM->hSpinLockTree);
2340	if (RTAvlU32Insert(&pGMM->pChunks, &pChunk->Core))
2341	{
2342	pGMM->cChunks++;
2343	RTListAppend(&pGMM->ChunkList, &pChunk->ListNode);
2344	RTSpinlockRelease(pGMM->hSpinLockTree);
2345
2346	gmmR0LinkChunk(pChunk, pSet);
2347
2348	LogFlow(("gmmR0RegisterChunk: pChunk=%p id=%#x cChunks=%d\n", pChunk, pChunk->Core.Key, pGMM->cChunks));
2349	GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
2350	return VINF_SUCCESS;
2351	}
2352
2353	/*
2354	* Bail out.
2355	*/
2356	RTSpinlockRelease(pGMM->hSpinLockTree);
2357	rc = VERR_GMM_CHUNK_INSERT;
2358	}
2359	else
2360	rc = VERR_GMM_IS_NOT_SANE;
2361	gmmR0MutexRelease(pGMM);
2362	}
2363	*ppChunk = NULL;
2364	}
2365
2366	/* Undo any page allocations. */
2367	if (!(fChunkFlags & GMM_CHUNK_FLAGS_LARGE_PAGE))
2368	{
2369	uint32_t const cToFree = pChunk->cPrivate;
2370	Assert(*piPage - iDstPageFirst == cToFree);
2371	for (uint32_t iDstPage = iDstPageFirst, iPage = 0; iPage < cToFree; iPage++, iDstPage++)
2372	{
2373	paPages[iDstPage].fZeroed = false;
2374	if (pChunk->aPages[iPage].Private.pfn == GMM_PAGE_PFN_UNSHAREABLE)
2375	paPages[iDstPage].HCPhysGCPhys = NIL_GMMPAGEDESC_PHYS;
2376	else
2377	paPages[iDstPage].HCPhysGCPhys = (RTHCPHYS)pChunk->aPages[iPage].Private.pfn << PAGE_SHIFT;
2378	paPages[iDstPage].idPage = NIL_GMM_PAGEID;
2379	paPages[iDstPage].idSharedPage = NIL_GMM_PAGEID;
2380	}
2381	*piPage = iDstPageFirst;
2382	}
2383
2384	gmmR0FreeChunkId(pGMM, pChunk->Core.Key);
2385	}
2386	else
2387	rc = VERR_GMM_CHUNK_INSERT;
2388	RTMemFree(pChunk);
2389	}
2390	else
2391	rc = VERR_NO_MEMORY;
2392	return rc;
2393	}
2394
2395
2396	/**
2397	* Allocates a new chunk, immediately picks the requested pages from it, and
2398	* adds what remains to the specified free set.
2399	*
2400	* @note This will leave the giant mutex while allocating the new chunk!
2401	*
2402	* @returns VBox status code.
2403	* @param pGMM Pointer to the GMM instance data.
2404	* @param pGVM Pointer to the kernel-only VM instance data.
2405	* @param pSet Pointer to the free set.
2406	* @param cPages The number of pages requested.
2407	* @param paPages The page descriptor table (input + output).
2408	* @param piPage The pointer to the page descriptor table index variable.
2409	* This will be updated.
2410	*/
2411	static int gmmR0AllocateChunkNew(PGMM pGMM, PGVM pGVM, PGMMCHUNKFREESET pSet, uint32_t cPages,
2412	PGMMPAGEDESC paPages, uint32_t *piPage)
2413	{
2414	gmmR0MutexRelease(pGMM);
2415
2416	RTR0MEMOBJ hMemObj;
2417	int rc;
2418	#ifdef VBOX_WITH_LINEAR_HOST_PHYS_MEM
2419	if (pGMM->fHasWorkingAllocPhysNC)
2420	rc = RTR0MemObjAllocPhysNC(&hMemObj, GMM_CHUNK_SIZE, NIL_RTHCPHYS);
2421	else
2422	#endif
2423	rc = RTR0MemObjAllocPage(&hMemObj, GMM_CHUNK_SIZE, false /*fExecutable*/);
2424	if (RT_SUCCESS(rc))
2425	{
2426	PGMMCHUNK pIgnored;
2427	rc = gmmR0RegisterChunk(pGMM, pSet, hMemObj, pGVM->hSelf, pGVM->pSession, 0 /*fChunkFlags*/,
2428	cPages, paPages, piPage, &pIgnored);
2429	if (RT_SUCCESS(rc))
2430	return VINF_SUCCESS;
2431
2432	/* bail out */
2433	RTR0MemObjFree(hMemObj, true /* fFreeMappings */);
2434	}
2435
2436	int rc2 = gmmR0MutexAcquire(pGMM);
2437	AssertRCReturn(rc2, RT_FAILURE(rc) ? rc : rc2);
2438	return rc;
2439
2440	}
2441
2442
2443	/**
2444	* As a last resort we'll pick any page we can get.
2445	*
2446	* @returns The new page descriptor table index.
2447	* @param pSet The set to pick from.
2448	* @param pGVM Pointer to the global VM structure.
2449	* @param uidSelf The UID of the caller.
2450	* @param iPage The current page descriptor table index.
2451	* @param cPages The total number of pages to allocate.
2452	* @param paPages The page descriptor table (input + output).
2453	*/
2454	static uint32_t gmmR0AllocatePagesIndiscriminately(PGMMCHUNKFREESET pSet, PGVM pGVM, RTUID uidSelf,
2455	uint32_t iPage, uint32_t cPages, PGMMPAGEDESC paPages)
2456	{
2457	unsigned iList = RT_ELEMENTS(pSet->apLists);
2458	while (iList-- > 0)
2459	{
2460	PGMMCHUNK pChunk = pSet->apLists[iList];
2461	while (pChunk)
2462	{
2463	PGMMCHUNK pNext = pChunk->pFreeNext;
2464	if ( pChunk->uidOwner == uidSelf
2465	|| ( pChunk->cMappingsX == 0
2466	&& pChunk->cFree == (GMM_CHUNK_SIZE >> PAGE_SHIFT)))
2467	{
2468	iPage = gmmR0AllocatePagesFromChunk(pChunk, pGVM->hSelf, iPage, cPages, paPages);
2469	if (iPage >= cPages)
2470	return iPage;
2471	}
2472
2473	pChunk = pNext;
2474	}
2475	}
2476	return iPage;
2477	}
2478
2479
2480	/**
2481	* Pick pages from empty chunks on the same NUMA node.
2482	*
2483	* @returns The new page descriptor table index.
2484	* @param pSet The set to pick from.
2485	* @param pGVM Pointer to the global VM structure.
2486	* @param uidSelf The UID of the caller.
2487	* @param iPage The current page descriptor table index.
2488	* @param cPages The total number of pages to allocate.
2489	* @param paPages The page descriptor table (input + output).
2490	*/
2491	static uint32_t gmmR0AllocatePagesFromEmptyChunksOnSameNode(PGMMCHUNKFREESET pSet, PGVM pGVM, RTUID uidSelf,
2492	uint32_t iPage, uint32_t cPages, PGMMPAGEDESC paPages)
2493	{
2494	PGMMCHUNK pChunk = pSet->apLists[GMM_CHUNK_FREE_SET_UNUSED_LIST];
2495	if (pChunk)
2496	{
2497	uint16_t const idNumaNode = gmmR0GetCurrentNumaNodeId();
2498	while (pChunk)
2499	{
2500	PGMMCHUNK pNext = pChunk->pFreeNext;
2501
2502	if ( pChunk->idNumaNode == idNumaNode
2503	&& ( pChunk->uidOwner == uidSelf
2504	|| pChunk->cMappingsX == 0))
2505	{
2506	pChunk->hGVM = pGVM->hSelf;
2507	pChunk->uidOwner = uidSelf;
2508	iPage = gmmR0AllocatePagesFromChunk(pChunk, pGVM->hSelf, iPage, cPages, paPages);
2509	if (iPage >= cPages)
2510	{
2511	pGVM->gmm.s.idLastChunkHint = pChunk->cFree ? pChunk->Core.Key : NIL_GMM_CHUNKID;
2512	return iPage;
2513	}
2514	}
2515
2516	pChunk = pNext;
2517	}
2518	}
2519	return iPage;
2520	}
2521
2522
2523	/**
2524	* Pick pages from non-empty chunks on the same NUMA node.
2525	*
2526	* @returns The new page descriptor table index.
2527	* @param pSet The set to pick from.
2528	* @param pGVM Pointer to the global VM structure.
2529	* @param uidSelf The UID of the caller.
2530	* @param iPage The current page descriptor table index.
2531	* @param cPages The total number of pages to allocate.
2532	* @param paPages The page descriptor table (input + output).
2533	*/
2534	static uint32_t gmmR0AllocatePagesFromSameNode(PGMMCHUNKFREESET pSet, PGVM pGVM, RTUID const uidSelf,
2535	uint32_t iPage, uint32_t cPages, PGMMPAGEDESC paPages)
2536	{
2537	/** @todo start by picking from chunks with about the right size first? */
2538	uint16_t const idNumaNode = gmmR0GetCurrentNumaNodeId();
2539	unsigned iList = GMM_CHUNK_FREE_SET_UNUSED_LIST;
2540	while (iList-- > 0)
2541	{
2542	PGMMCHUNK pChunk = pSet->apLists[iList];
2543	while (pChunk)
2544	{
2545	PGMMCHUNK pNext = pChunk->pFreeNext;
2546
2547	if ( pChunk->idNumaNode == idNumaNode
2548	&& pChunk->uidOwner == uidSelf)
2549	{
2550	iPage = gmmR0AllocatePagesFromChunk(pChunk, pGVM->hSelf, iPage, cPages, paPages);
2551	if (iPage >= cPages)
2552	{
2553	pGVM->gmm.s.idLastChunkHint = pChunk->cFree ? pChunk->Core.Key : NIL_GMM_CHUNKID;
2554	return iPage;
2555	}
2556	}
2557
2558	pChunk = pNext;
2559	}
2560	}
2561	return iPage;
2562	}
2563
2564
2565	/**
2566	* Pick pages that are in chunks already associated with the VM.
2567	*
2568	* @returns The new page descriptor table index.
2569	* @param pGMM Pointer to the GMM instance data.
2570	* @param pGVM Pointer to the global VM structure.
2571	* @param pSet The set to pick from.
2572	* @param iPage The current page descriptor table index.
2573	* @param cPages The total number of pages to allocate.
2574	* @param paPages The page descriptor table (input + output).
2575	*/
2576	static uint32_t gmmR0AllocatePagesAssociatedWithVM(PGMM pGMM, PGVM pGVM, PGMMCHUNKFREESET pSet,
2577	uint32_t iPage, uint32_t cPages, PGMMPAGEDESC paPages)
2578	{
2579	uint16_t const hGVM = pGVM->hSelf;
2580
2581	/* Hint. */
2582	if (pGVM->gmm.s.idLastChunkHint != NIL_GMM_CHUNKID)
2583	{
2584	PGMMCHUNK pChunk = gmmR0GetChunk(pGMM, pGVM->gmm.s.idLastChunkHint);
2585	if (pChunk && pChunk->cFree)
2586	{
2587	iPage = gmmR0AllocatePagesFromChunk(pChunk, hGVM, iPage, cPages, paPages);
2588	if (iPage >= cPages)
2589	return iPage;
2590	}
2591	}
2592
2593	/* Scan. */
2594	for (unsigned iList = 0; iList < RT_ELEMENTS(pSet->apLists); iList++)
2595	{
2596	PGMMCHUNK pChunk = pSet->apLists[iList];
2597	while (pChunk)
2598	{
2599	PGMMCHUNK pNext = pChunk->pFreeNext;
2600
2601	if (pChunk->hGVM == hGVM)
2602	{
2603	iPage = gmmR0AllocatePagesFromChunk(pChunk, hGVM, iPage, cPages, paPages);
2604	if (iPage >= cPages)
2605	{
2606	pGVM->gmm.s.idLastChunkHint = pChunk->cFree ? pChunk->Core.Key : NIL_GMM_CHUNKID;
2607	return iPage;
2608	}
2609	}
2610
2611	pChunk = pNext;
2612	}
2613	}
2614	return iPage;
2615	}
2616
2617
2618
2619	/**
2620	* Pick pages in bound memory mode.
2621	*
2622	* @returns The new page descriptor table index.
2623	* @param pGVM Pointer to the global VM structure.
2624	* @param iPage The current page descriptor table index.
2625	* @param cPages The total number of pages to allocate.
2626	* @param paPages The page descriptor table (input + output).
2627	*/
2628	static uint32_t gmmR0AllocatePagesInBoundMode(PGVM pGVM, uint32_t iPage, uint32_t cPages, PGMMPAGEDESC paPages)
2629	{
2630	for (unsigned iList = 0; iList < RT_ELEMENTS(pGVM->gmm.s.Private.apLists); iList++)
2631	{
2632	PGMMCHUNK pChunk = pGVM->gmm.s.Private.apLists[iList];
2633	while (pChunk)
2634	{
2635	Assert(pChunk->hGVM == pGVM->hSelf);
2636	PGMMCHUNK pNext = pChunk->pFreeNext;
2637	iPage = gmmR0AllocatePagesFromChunk(pChunk, pGVM->hSelf, iPage, cPages, paPages);
2638	if (iPage >= cPages)
2639	return iPage;
2640	pChunk = pNext;
2641	}
2642	}
2643	return iPage;
2644	}
2645
2646
2647	/**
2648	* Checks if we should start picking pages from chunks of other VMs because
2649	* we're getting close to the system memory or reserved limit.
2650	*
2651	* @returns @c true if we should, @c false if we should first try to allocate
2652	* more chunks.
2653	*/
2654	static bool gmmR0ShouldAllocatePagesInOtherChunksBecauseOfLimits(PGVM pGVM)
2655	{
2656	/*
2657	* Don't allocate a new chunk if we're close to the reserved limit (fewer than four chunks' worth of pages left).
2658	*/
2659	uint64_t cPgReserved = pGVM->gmm.s.Stats.Reserved.cBasePages
2660	+ pGVM->gmm.s.Stats.Reserved.cFixedPages
2661	- pGVM->gmm.s.Stats.cBalloonedPages
2662	/** @todo what about shared pages? */;
2663	uint64_t cPgAllocated = pGVM->gmm.s.Stats.Allocated.cBasePages
2664	+ pGVM->gmm.s.Stats.Allocated.cFixedPages;
2665	uint64_t cPgDelta = cPgReserved - cPgAllocated;
2666	if (cPgDelta < GMM_CHUNK_NUM_PAGES * 4)
2667	return true;
2668	/** @todo make the threshold configurable, also test the code to see if
2669	* this ever kicks in (we might be reserving too much or smth). */
2670
2671	/*
2672	* Check how close we're to the max memory limit and how many fragments
2673	* there are?...
2674	*/
2675	/** @todo */
2676
2677	return false;
2678	}
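The limit check above compares what a VM has reserved (minus ballooned pages) with what it has already allocated and reports "close to the limit" when fewer than four chunks' worth of pages remain. The sketch below restates that arithmetic with a hypothetical stats struct and an assumed 512 pages per chunk. One deliberate difference: the listing subtracts the unsigned totals directly, whereas this sketch clamps the difference at zero so that an over-allocated VM counts as being at the limit.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical flattened view of the per-VM statistics used by the check.
struct VmStats
{
    uint64_t cReservedBase;
    uint64_t cReservedFixed;
    uint64_t cBallooned;
    uint64_t cAllocatedBase;
    uint64_t cAllocatedFixed;
};

constexpr uint64_t kPagesPerChunk   = 512;  // assumed: 2 MB chunks of 4 KB pages
constexpr uint64_t kThresholdChunks = 4;    // the factor used in the listing

// Returns true when the VM has less than four chunks of headroom left.
bool nearReservationLimit(const VmStats &s)
{
    uint64_t const cReserved  = s.cReservedBase + s.cReservedFixed - s.cBallooned;
    uint64_t const cAllocated = s.cAllocatedBase + s.cAllocatedFixed;
    // Clamp instead of letting the unsigned subtraction wrap (sketch-only choice).
    uint64_t const cDelta     = cReserved > cAllocated ? cReserved - cAllocated : 0;
    return cDelta < kPagesPerChunk * kThresholdChunks;
}
```

With these assumed constants the threshold is 2048 pages, so a VM with exactly 2048 pages of headroom is still treated as having room for a new chunk.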
2679
2680
2681	/**
2682	* Checks if we should start picking pages from chunks of other VMs because
2683	* there are a lot of free pages around.
2684	*
2685	* @returns @c true if we should, @c false if we should first try to allocate
2686	* more chunks.
2687	*/
2688	static bool gmmR0ShouldAllocatePagesInOtherChunksBecauseOfLotsFree(PGMM pGMM)
2689	{
2690	/*
2691	* Setting the limit at 16 chunks (32 MB) at the moment.
2692	*/
2693	if (pGMM->PrivateX.cFreePages >= GMM_CHUNK_NUM_PAGES * 16)
2694	return true;
2695	return false;
2696	}
2697
2698
2699	/**
2700	* Common worker for GMMR0AllocateHandyPages and GMMR0AllocatePages.
2701	*
2702	* @returns VBox status code:
2703	* @retval VINF_SUCCESS on success.
2704	* @retval VERR_GMM_HIT_GLOBAL_LIMIT if we've exhausted the available pages.
2705	* @retval VERR_GMM_HIT_VM_ACCOUNT_LIMIT if we've hit the VM account limit,
2706	* that is we're trying to allocate more than we've reserved.
2707	*
2708	* @param pGMM Pointer to the GMM instance data.
2709	* @param pGVM Pointer to the VM.
2710	* @param cPages The number of pages to allocate.
2711	* @param paPages Pointer to the page descriptors. See GMMPAGEDESC for
2712	* details on what is expected on input.
2713	* @param enmAccount The account to charge.
2714	*
2715	* @remarks Caller owns the giant GMM lock.
2716	*/
2717	static int gmmR0AllocatePagesNew(PGMM pGMM, PGVM pGVM, uint32_t cPages, PGMMPAGEDESC paPages, GMMACCOUNT enmAccount)
2718	{
2719	Assert(pGMM->hMtxOwner == RTThreadNativeSelf());
2720
2721	/*
2722	* Check allocation limits.
2723	*/
2724	if (RT_LIKELY(pGMM->cAllocatedPages + cPages <= pGMM->cMaxPages))
2725	{ /* likely */ }
2726	else
2727	return VERR_GMM_HIT_GLOBAL_LIMIT;
2728
2729	switch (enmAccount)
2730	{
2731	case GMMACCOUNT_BASE:
2732	if (RT_LIKELY( pGVM->gmm.s.Stats.Allocated.cBasePages + pGVM->gmm.s.Stats.cBalloonedPages + cPages
2733	<= pGVM->gmm.s.Stats.Reserved.cBasePages))
2734	{ /* likely */ }
2735	else
2736	{
2737	Log(("gmmR0AllocatePages:Base: Reserved=%#llx Allocated+Ballooned+Requested=%#llx+%#llx+%#x!\n",
2738	pGVM->gmm.s.Stats.Reserved.cBasePages, pGVM->gmm.s.Stats.Allocated.cBasePages,
2739	pGVM->gmm.s.Stats.cBalloonedPages, cPages));
2740	return VERR_GMM_HIT_VM_ACCOUNT_LIMIT;
2741	}
2742	break;
2743	case GMMACCOUNT_SHADOW:
2744	if (RT_LIKELY(pGVM->gmm.s.Stats.Allocated.cShadowPages + cPages <= pGVM->gmm.s.Stats.Reserved.cShadowPages))
2745	{ /* likely */ }
2746	else
2747	{
2748	Log(("gmmR0AllocatePages:Shadow: Reserved=%#x Allocated+Requested=%#x+%#x!\n",
2749	pGVM->gmm.s.Stats.Reserved.cShadowPages, pGVM->gmm.s.Stats.Allocated.cShadowPages, cPages));
2750	return VERR_GMM_HIT_VM_ACCOUNT_LIMIT;
2751	}
2752	break;
2753	case GMMACCOUNT_FIXED:
2754	if (RT_LIKELY(pGVM->gmm.s.Stats.Allocated.cFixedPages + cPages <= pGVM->gmm.s.Stats.Reserved.cFixedPages))
2755	{ /* likely */ }
2756	else
2757	{
2758	Log(("gmmR0AllocatePages:Fixed: Reserved=%#x Allocated+Requested=%#x+%#x!\n",
2759	pGVM->gmm.s.Stats.Reserved.cFixedPages, pGVM->gmm.s.Stats.Allocated.cFixedPages, cPages));
2760	return VERR_GMM_HIT_VM_ACCOUNT_LIMIT;
2761	}
2762	break;
2763	default:
2764	AssertMsgFailedReturn(("enmAccount=%d\n", enmAccount), VERR_IPE_NOT_REACHED_DEFAULT_CASE);
2765	}
2766
2767	/*
2768	* Update the accounts before we proceed because we might be leaving the
2769	* protection of the global mutex and thus run the risk of permitting
2770	* too much memory to be allocated.
2771	*/
2772	switch (enmAccount)
2773	{
2774	case GMMACCOUNT_BASE: pGVM->gmm.s.Stats.Allocated.cBasePages += cPages; break;
2775	case GMMACCOUNT_SHADOW: pGVM->gmm.s.Stats.Allocated.cShadowPages += cPages; break;
2776	case GMMACCOUNT_FIXED: pGVM->gmm.s.Stats.Allocated.cFixedPages += cPages; break;
2777	default: AssertMsgFailedReturn(("enmAccount=%d\n", enmAccount), VERR_IPE_NOT_REACHED_DEFAULT_CASE);
2778	}
2779	pGVM->gmm.s.Stats.cPrivatePages += cPages;
2780	pGMM->cAllocatedPages += cPages;
2781
2782	/*
2783	* Bound mode is also relatively straightforward.
2784	*/
2785	uint32_t iPage = 0;
2786	int rc = VINF_SUCCESS;
2787	if (pGMM->fBoundMemoryMode)
2788	{
2789	iPage = gmmR0AllocatePagesInBoundMode(pGVM, iPage, cPages, paPages);
2790	if (iPage < cPages)
2791	do
2792	rc = gmmR0AllocateChunkNew(pGMM, pGVM, &pGVM->gmm.s.Private, cPages, paPages, &iPage);
2793	while (iPage < cPages && RT_SUCCESS(rc));
2794	}
2795	/*
2796	* Shared mode is trickier as we should try to achieve the same locality as
2797	* in bound mode, but smartly make use of non-full chunks allocated by
2798	* other VMs if we're low on memory.
2799	*/
2800	else
2801	{
2802	RTUID const uidSelf = SUPR0GetSessionUid(pGVM->pSession);
2803
2804	/* Pick the most optimal pages first. */
2805	iPage = gmmR0AllocatePagesAssociatedWithVM(pGMM, pGVM, &pGMM->PrivateX, iPage, cPages, paPages);
2806	if (iPage < cPages)
2807	{
2808	/* Maybe we should try getting pages from chunks "belonging" to
2809	other VMs before allocating more chunks? */
2810	bool fTriedOnSameAlready = false;
2811	if (gmmR0ShouldAllocatePagesInOtherChunksBecauseOfLimits(pGVM))
2812	{
2813	iPage = gmmR0AllocatePagesFromSameNode(&pGMM->PrivateX, pGVM, uidSelf, iPage, cPages, paPages);
2814	fTriedOnSameAlready = true;
2815	}
2816
2817	/* Allocate memory from empty chunks. */
2818	if (iPage < cPages)
2819	iPage = gmmR0AllocatePagesFromEmptyChunksOnSameNode(&pGMM->PrivateX, pGVM, uidSelf, iPage, cPages, paPages);
2820
2821	/* Grab empty shared chunks. */
2822	if (iPage < cPages)
2823	iPage = gmmR0AllocatePagesFromEmptyChunksOnSameNode(&pGMM->Shared, pGVM, uidSelf, iPage, cPages, paPages);
2824
2825	/* If there are a lot of free pages spread around, try not to waste
2826	system memory on more chunks. (Should trigger defragmentation.) */
2827	if ( !fTriedOnSameAlready
2828	&& gmmR0ShouldAllocatePagesInOtherChunksBecauseOfLotsFree(pGMM))
2829	{
2830	iPage = gmmR0AllocatePagesFromSameNode(&pGMM->PrivateX, pGVM, uidSelf, iPage, cPages, paPages);
2831	if (iPage < cPages)
2832	iPage = gmmR0AllocatePagesIndiscriminately(&pGMM->PrivateX, pGVM, uidSelf, iPage, cPages, paPages);
2833	}
2834
2835	/*
2836	* OK, try to allocate new chunks.
2837	*/
2838	if (iPage < cPages)
2839	{
2840	do
2841	rc = gmmR0AllocateChunkNew(pGMM, pGVM, &pGMM->PrivateX, cPages, paPages, &iPage);
2842	while (iPage < cPages && RT_SUCCESS(rc));
2843
2844	#if 0 /* We cannot mix chunks with different UIDs. */
2845	/* If the host is out of memory, take whatever we can get. */
2846	if ( (rc == VERR_NO_MEMORY || rc == VERR_NO_PHYS_MEMORY)
2847	&& pGMM->PrivateX.cFreePages + pGMM->Shared.cFreePages >= cPages - iPage)
2848	{
2849	iPage = gmmR0AllocatePagesIndiscriminately(&pGMM->PrivateX, pGVM, iPage, cPages, paPages);
2850	if (iPage < cPages)
2851	iPage = gmmR0AllocatePagesIndiscriminately(&pGMM->Shared, pGVM, iPage, cPages, paPages);
2852	AssertRelease(iPage == cPages);
2853	rc = VINF_SUCCESS;
2854	}
2855	#endif
2856	}
2857	}
2858	}
2859
2860	/*
2861	* Clean up on failure. Since this is bound to be a low-memory condition
2862	* we will give back any empty chunks that might be hanging around.
2863	*/
2864	if (RT_SUCCESS(rc))
2865	{ /* likely */ }
2866	else
2867	{
2868	/* Update the statistics. */
2869	pGVM->gmm.s.Stats.cPrivatePages -= cPages;
2870	pGMM->cAllocatedPages -= cPages - iPage;
2871	switch (enmAccount)
2872	{
2873	case GMMACCOUNT_BASE: pGVM->gmm.s.Stats.Allocated.cBasePages -= cPages; break;
2874	case GMMACCOUNT_SHADOW: pGVM->gmm.s.Stats.Allocated.cShadowPages -= cPages; break;
2875	case GMMACCOUNT_FIXED: pGVM->gmm.s.Stats.Allocated.cFixedPages -= cPages; break;
2876	default: AssertMsgFailedReturn(("enmAccount=%d\n", enmAccount), VERR_IPE_NOT_REACHED_DEFAULT_CASE);
2877	}
2878
2879	/* Release the pages. */
2880	while (iPage-- > 0)
2881	{
2882	uint32_t idPage = paPages[iPage].idPage;
2883	PGMMPAGE pPage = gmmR0GetPage(pGMM, idPage);
2884	if (RT_LIKELY(pPage))
2885	{
2886	Assert(GMM_PAGE_IS_PRIVATE(pPage));
2887	Assert(pPage->Private.hGVM == pGVM->hSelf);
2888	gmmR0FreePrivatePage(pGMM, pGVM, idPage, pPage);
2889	}
2890	else
2891	AssertMsgFailed(("idPage=%#x\n", idPage));
2892
2893	paPages[iPage].idPage = NIL_GMM_PAGEID;
2894	paPages[iPage].idSharedPage = NIL_GMM_PAGEID;
2895	paPages[iPage].HCPhysGCPhys = NIL_GMMPAGEDESC_PHYS;
2896	paPages[iPage].fZeroed = false;
2897	}
2898
2899	/* Free empty chunks. */
2900	/** @todo */
2901
2902	/* return the fail status on failure */
2903	return rc;
2904	}
2905	return VINF_SUCCESS;
2906	}
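
/*
 * Editorial sketch, not part of the GMM sources. It reduces the accounting in
 * gmmR0AllocatePagesNew above to two counters to show why the failure path
 * subtracts cPages - iPage from the global count: the pages that were obtained
 * are given back one at a time, and each release decrements the global counter
 * itself (as gmmR0FreePrivatePage does). All Sketch* names and the cObtainable
 * parameter are hypothetical.
 */
```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the per-VM and global page counters.
struct SketchAccounts
{
    uint64_t cVmBasePages;      // per-VM account (think Stats.Allocated.cBasePages)
    uint64_t cGlobalAllocated;  // global counter (think pGMM->cAllocatedPages)
};

// Charges cPages up front, then pretends only cObtainable pages could be found.
// Returns true on success; on failure both counters are restored.
inline bool sketchAllocate(SketchAccounts &acc, uint32_t cPages, uint32_t cObtainable)
{
    // Charge first, so nothing can over-commit while the lock is dropped.
    acc.cVmBasePages     += cPages;
    acc.cGlobalAllocated += cPages;

    uint32_t const cGot = cObtainable < cPages ? cObtainable : cPages;
    if (cGot == cPages)
        return true;

    // Roll back: the per-VM account is restored in one step ...
    acc.cVmBasePages     -= cPages;
    // ... the global counter first loses the pages that were never obtained ...
    acc.cGlobalAllocated -= cPages - cGot;
    // ... and then one page per release of the pages that were obtained.
    for (uint32_t i = 0; i < cGot; i++)
        acc.cGlobalAllocated--;
    return false;
}
```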
2907
2908
2909	/**
2910	* Updates the previous allocations and allocates more pages.
2911	*
2912	* The handy pages are always taken from the 'base' memory account.
2913	* The allocated pages are not cleared and will contain random garbage.
2914	*
2915	* @returns VBox status code:
2916	* @retval VINF_SUCCESS on success.
2917	* @retval VERR_NOT_OWNER if the caller is not an EMT.
2918	* @retval VERR_GMM_PAGE_NOT_FOUND if one of the pages to update wasn't found.
2919	* @retval VERR_GMM_PAGE_NOT_PRIVATE if one of the pages to update wasn't a
2920	* private page.
2921	* @retval VERR_GMM_PAGE_NOT_SHARED if one of the pages to update wasn't a
2922	* shared page.
2923	* @retval VERR_GMM_NOT_PAGE_OWNER if one of the pages to be updated wasn't
2924	* owned by the VM.
2925	* @retval VERR_GMM_HIT_GLOBAL_LIMIT if we've exhausted the available pages.
2926	* @retval VERR_GMM_HIT_VM_ACCOUNT_LIMIT if we've hit the VM account limit,
2927	* that is we're trying to allocate more than we've reserved.
2928	*
2929	* @param pGVM The global (ring-0) VM structure.
2930	* @param idCpu The VCPU id.
2931	* @param cPagesToUpdate The number of pages to update (starting from the head).
2932	* @param cPagesToAlloc The number of pages to allocate (starting from the head).
2933	* @param paPages The array of page descriptors.
2934	* See GMMPAGEDESC for details on what is expected on input.
2935	* @thread EMT(idCpu)
2936	*/
2937	GMMR0DECL(int) GMMR0AllocateHandyPages(PGVM pGVM, VMCPUID idCpu, uint32_t cPagesToUpdate,
2938	uint32_t cPagesToAlloc, PGMMPAGEDESC paPages)
2939	{
2940	LogFlow(("GMMR0AllocateHandyPages: pGVM=%p cPagesToUpdate=%#x cPagesToAlloc=%#x paPages=%p\n",
2941	pGVM, cPagesToUpdate, cPagesToAlloc, paPages));
2942
2943	/*
2944	* Validate, get basics and take the semaphore.
2945	* (This is a relatively busy path, so make predictions where possible.)
2946	*/
2947	PGMM pGMM;
2948	GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
2949	int rc = GVMMR0ValidateGVMandEMT(pGVM, idCpu);
2950	if (RT_FAILURE(rc))
2951	return rc;
2952
2953	AssertPtrReturn(paPages, VERR_INVALID_PARAMETER);
2954	AssertMsgReturn( (cPagesToUpdate && cPagesToUpdate < 1024)
2955	|| (cPagesToAlloc && cPagesToAlloc < 1024),
2956	("cPagesToUpdate=%#x cPagesToAlloc=%#x\n", cPagesToUpdate, cPagesToAlloc),
2957	VERR_INVALID_PARAMETER);
2958
2959	unsigned iPage = 0;
2960	for (; iPage < cPagesToUpdate; iPage++)
2961	{
2962	AssertMsgReturn( ( paPages[iPage].HCPhysGCPhys <= GMM_GCPHYS_LAST
2963	&& !(paPages[iPage].HCPhysGCPhys & PAGE_OFFSET_MASK))
2964	|| paPages[iPage].HCPhysGCPhys == NIL_GMMPAGEDESC_PHYS
2965	|| paPages[iPage].HCPhysGCPhys == GMM_GCPHYS_UNSHAREABLE,
2966	("#%#x: %RHp\n", iPage, paPages[iPage].HCPhysGCPhys),
2967	VERR_INVALID_PARAMETER);
2968	/* ignore fZeroed here */
2969	AssertMsgReturn( paPages[iPage].idPage <= GMM_PAGEID_LAST
2970	/*|| paPages[iPage].idPage == NIL_GMM_PAGEID*/,
2971	("#%#x: %#x\n", iPage, paPages[iPage].idPage), VERR_INVALID_PARAMETER);
2972	AssertMsgReturn( paPages[iPage].idSharedPage == NIL_GMM_PAGEID
2973	|| paPages[iPage].idSharedPage <= GMM_PAGEID_LAST,
2974	("#%#x: %#x\n", iPage, paPages[iPage].idSharedPage), VERR_INVALID_PARAMETER);
2975	}
2976
2977	for (; iPage < cPagesToAlloc; iPage++)
2978	{
2979	AssertMsgReturn(paPages[iPage].HCPhysGCPhys == NIL_GMMPAGEDESC_PHYS, ("#%#x: %RHp\n", iPage, paPages[iPage].HCPhysGCPhys), VERR_INVALID_PARAMETER);
2980	AssertMsgReturn(paPages[iPage].fZeroed == false, ("#%#x: %#x\n", iPage, paPages[iPage].fZeroed), VERR_INVALID_PARAMETER);
2981	AssertMsgReturn(paPages[iPage].idPage == NIL_GMM_PAGEID, ("#%#x: %#x\n", iPage, paPages[iPage].idPage), VERR_INVALID_PARAMETER);
2982	AssertMsgReturn(paPages[iPage].idSharedPage == NIL_GMM_PAGEID, ("#%#x: %#x\n", iPage, paPages[iPage].idSharedPage), VERR_INVALID_PARAMETER);
2983	}
2984
2985	gmmR0MutexAcquire(pGMM);
2986	if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
2987	{
2988	/* No allocations before the initial reservation has been made! */
2989	if (RT_LIKELY( pGVM->gmm.s.Stats.Reserved.cBasePages
2990	&& pGVM->gmm.s.Stats.Reserved.cFixedPages
2991	&& pGVM->gmm.s.Stats.Reserved.cShadowPages))
2992	{
2993	/*
2994	* Perform the updates.
2995	* Stop on the first error.
2996	*/
2997	for (iPage = 0; iPage < cPagesToUpdate; iPage++)
2998	{
2999	if (paPages[iPage].idPage != NIL_GMM_PAGEID)
3000	{
3001	PGMMPAGE pPage = gmmR0GetPage(pGMM, paPages[iPage].idPage);
3002	if (RT_LIKELY(pPage))
3003	{
3004	if (RT_LIKELY(GMM_PAGE_IS_PRIVATE(pPage)))
3005	{
3006	if (RT_LIKELY(pPage->Private.hGVM == pGVM->hSelf))
3007	{
3008	AssertCompile(NIL_RTHCPHYS > GMM_GCPHYS_LAST && GMM_GCPHYS_UNSHAREABLE > GMM_GCPHYS_LAST);
3009	if (RT_LIKELY(paPages[iPage].HCPhysGCPhys <= GMM_GCPHYS_LAST))
3010	pPage->Private.pfn = paPages[iPage].HCPhysGCPhys >> PAGE_SHIFT;
3011	else if (paPages[iPage].HCPhysGCPhys == GMM_GCPHYS_UNSHAREABLE)
3012	pPage->Private.pfn = GMM_PAGE_PFN_UNSHAREABLE;
3013	/* else: NIL_RTHCPHYS nothing */
3014
3015	paPages[iPage].idPage = NIL_GMM_PAGEID;
3016	paPages[iPage].HCPhysGCPhys = NIL_GMMPAGEDESC_PHYS;
3017	paPages[iPage].fZeroed = false;
3018	}
3019	else
3020	{
3021	Log(("GMMR0AllocateHandyPages: #%#x/%#x: Not owner! hGVM=%#x hSelf=%#x\n",
3022	iPage, paPages[iPage].idPage, pPage->Private.hGVM, pGVM->hSelf));
3023	rc = VERR_GMM_NOT_PAGE_OWNER;
3024	break;
3025	}
3026	}
3027	else
3028	{
3029	Log(("GMMR0AllocateHandyPages: #%#x/%#x: Not private! %.*Rhxs (type %d)\n", iPage, paPages[iPage].idPage, sizeof(*pPage), pPage, pPage->Common.u2State));
3030	rc = VERR_GMM_PAGE_NOT_PRIVATE;
3031	break;
3032	}
3033	}
3034	else
3035	{
3036	Log(("GMMR0AllocateHandyPages: #%#x/%#x: Not found! (private)\n", iPage, paPages[iPage].idPage));
3037	rc = VERR_GMM_PAGE_NOT_FOUND;
3038	break;
3039	}
3040	}
3041
3042	if (paPages[iPage].idSharedPage == NIL_GMM_PAGEID)
3043	{ /* likely */ }
3044	else
3045	{
3046	PGMMPAGE pPage = gmmR0GetPage(pGMM, paPages[iPage].idSharedPage);
3047	if (RT_LIKELY(pPage))
3048	{
3049	if (RT_LIKELY(GMM_PAGE_IS_SHARED(pPage)))
3050	{
3051	AssertCompile(NIL_RTHCPHYS > GMM_GCPHYS_LAST && GMM_GCPHYS_UNSHAREABLE > GMM_GCPHYS_LAST);
3052	Assert(pPage->Shared.cRefs);
3053	Assert(pGVM->gmm.s.Stats.cSharedPages);
3054	Assert(pGVM->gmm.s.Stats.Allocated.cBasePages);
3055
3056	Log(("GMMR0AllocateHandyPages: free shared page %x cRefs=%d\n", paPages[iPage].idSharedPage, pPage->Shared.cRefs));
3057	pGVM->gmm.s.Stats.cSharedPages--;
3058	pGVM->gmm.s.Stats.Allocated.cBasePages--;
3059	if (!--pPage->Shared.cRefs)
3060	gmmR0FreeSharedPage(pGMM, pGVM, paPages[iPage].idSharedPage, pPage);
3061	else
3062	{
3063	Assert(pGMM->cDuplicatePages);
3064	pGMM->cDuplicatePages--;
3065	}
3066
3067	paPages[iPage].idSharedPage = NIL_GMM_PAGEID;
3068	}
3069	else
3070	{
3071	Log(("GMMR0AllocateHandyPages: #%#x/%#x: Not shared!\n", iPage, paPages[iPage].idSharedPage));
3072	rc = VERR_GMM_PAGE_NOT_SHARED;
3073	break;
3074	}
3075	}
3076	else
3077	{
3078	Log(("GMMR0AllocateHandyPages: #%#x/%#x: Not found! (shared)\n", iPage, paPages[iPage].idSharedPage));
3079	rc = VERR_GMM_PAGE_NOT_FOUND;
3080	break;
3081	}
3082	}
3083	} /* for each page to update */
3084
3085	if (RT_SUCCESS(rc) && cPagesToAlloc > 0)
3086	{
3087	#ifdef VBOX_STRICT
3088	for (iPage = 0; iPage < cPagesToAlloc; iPage++)
3089	{
3090	Assert(paPages[iPage].HCPhysGCPhys == NIL_GMMPAGEDESC_PHYS);
3091	Assert(paPages[iPage].fZeroed == false);
3092	Assert(paPages[iPage].idPage == NIL_GMM_PAGEID);
3093	Assert(paPages[iPage].idSharedPage == NIL_GMM_PAGEID);
3094	}
3095	#endif
3096
3097	/*
3098	* Join paths with GMMR0AllocatePages for the allocation.
3099	* Note! gmmR0AllocateMoreChunks may leave the protection of the mutex!
3100	*/
3101	rc = gmmR0AllocatePagesNew(pGMM, pGVM, cPagesToAlloc, paPages, GMMACCOUNT_BASE);
3102	}
3103	}
3104	else
3105	rc = VERR_WRONG_ORDER;
3106	GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
3107	}
3108	else
3109	rc = VERR_GMM_IS_NOT_SANE;
3110	gmmR0MutexRelease(pGMM);
3111	LogFlow(("GMMR0AllocateHandyPages: returns %Rrc\n", rc));
3112	return rc;
3113	}
3114
3115
3116	/**
3117	* Allocate one or more pages.
3118	*
3119	* This is typically used for ROMs and MMIO2 (VRAM) during VM creation.
3120	* The allocated pages are not cleared and will contain random garbage.
3121	*
3122	* @returns VBox status code:
3123	* @retval VINF_SUCCESS on success.
3124	* @retval VERR_NOT_OWNER if the caller is not an EMT.
3125	* @retval VERR_GMM_HIT_GLOBAL_LIMIT if we've exhausted the available pages.
3126	* @retval VERR_GMM_HIT_VM_ACCOUNT_LIMIT if we've hit the VM account limit,
3127	* that is we're trying to allocate more than we've reserved.
3128	*
3129	* @param pGVM The global (ring-0) VM structure.
3130	* @param idCpu The VCPU id.
3131	* @param cPages The number of pages to allocate.
3132	* @param paPages Pointer to the page descriptors.
3133	* See GMMPAGEDESC for details on what is expected on
3134	* input.
3135	* @param enmAccount The account to charge.
3136	*
3137	* @thread EMT.
3138	*/
3139	GMMR0DECL(int) GMMR0AllocatePages(PGVM pGVM, VMCPUID idCpu, uint32_t cPages, PGMMPAGEDESC paPages, GMMACCOUNT enmAccount)
3140	{
3141	LogFlow(("GMMR0AllocatePages: pGVM=%p cPages=%#x paPages=%p enmAccount=%d\n", pGVM, cPages, paPages, enmAccount));
3142
3143	/*
3144	* Validate, get basics and take the semaphore.
3145	*/
3146	PGMM pGMM;
3147	GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
3148	int rc = GVMMR0ValidateGVMandEMT(pGVM, idCpu);
3149	if (RT_FAILURE(rc))
3150	return rc;
3151
3152	AssertPtrReturn(paPages, VERR_INVALID_PARAMETER);
3153	AssertMsgReturn(enmAccount > GMMACCOUNT_INVALID && enmAccount < GMMACCOUNT_END, ("%d\n", enmAccount), VERR_INVALID_PARAMETER);
3154	AssertMsgReturn(cPages > 0 && cPages < RT_BIT(32 - PAGE_SHIFT), ("%#x\n", cPages), VERR_INVALID_PARAMETER);
3155
3156	for (unsigned iPage = 0; iPage < cPages; iPage++)
3157	{
3158	AssertMsgReturn( paPages[iPage].HCPhysGCPhys == NIL_GMMPAGEDESC_PHYS
3159	|| paPages[iPage].HCPhysGCPhys == GMM_GCPHYS_UNSHAREABLE
3160	|| ( enmAccount == GMMACCOUNT_BASE
3161	&& paPages[iPage].HCPhysGCPhys <= GMM_GCPHYS_LAST
3162	&& !(paPages[iPage].HCPhysGCPhys & PAGE_OFFSET_MASK)),
3163	("#%#x: %RHp enmAccount=%d\n", iPage, paPages[iPage].HCPhysGCPhys, enmAccount),
3164	VERR_INVALID_PARAMETER);
3165	AssertMsgReturn(paPages[iPage].fZeroed == false, ("#%#x: %#x\n", iPage, paPages[iPage].fZeroed), VERR_INVALID_PARAMETER);
3166	AssertMsgReturn(paPages[iPage].idPage == NIL_GMM_PAGEID, ("#%#x: %#x\n", iPage, paPages[iPage].idPage), VERR_INVALID_PARAMETER);
3167	AssertMsgReturn(paPages[iPage].idSharedPage == NIL_GMM_PAGEID, ("#%#x: %#x\n", iPage, paPages[iPage].idSharedPage), VERR_INVALID_PARAMETER);
3168	}
3169
3170	/*
3171	* Grab the giant mutex and get working.
3172	*/
3173	gmmR0MutexAcquire(pGMM);
3174	if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
3175	{
3176
3177	/* No allocations before the initial reservation has been made! */
3178	if (RT_LIKELY( pGVM->gmm.s.Stats.Reserved.cBasePages
3179	&& pGVM->gmm.s.Stats.Reserved.cFixedPages
3180	&& pGVM->gmm.s.Stats.Reserved.cShadowPages))
3181	rc = gmmR0AllocatePagesNew(pGMM, pGVM, cPages, paPages, enmAccount);
3182	else
3183	rc = VERR_WRONG_ORDER;
3184	GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
3185	}
3186	else
3187	rc = VERR_GMM_IS_NOT_SANE;
3188	gmmR0MutexRelease(pGMM);
3189
3190	LogFlow(("GMMR0AllocatePages: returns %Rrc\n", rc));
3191	return rc;
3192	}
3193
3194
3195	/**
3196	* VMMR0 request wrapper for GMMR0AllocatePages.
3197	*
3198	* @returns see GMMR0AllocatePages.
3199	* @param pGVM The global (ring-0) VM structure.
3200	* @param idCpu The VCPU id.
3201	* @param pReq Pointer to the request packet.
3202	*/
3203	GMMR0DECL(int) GMMR0AllocatePagesReq(PGVM pGVM, VMCPUID idCpu, PGMMALLOCATEPAGESREQ pReq)
3204	{
3205	/*
3206	* Validate input and pass it on.
3207	*/
3208	AssertPtrReturn(pReq, VERR_INVALID_POINTER);
3209	AssertMsgReturn(pReq->Hdr.cbReq >= RT_UOFFSETOF(GMMALLOCATEPAGESREQ, aPages[0]),
3210	("%#x < %#x\n", pReq->Hdr.cbReq, RT_UOFFSETOF(GMMALLOCATEPAGESREQ, aPages[0])),
3211	VERR_INVALID_PARAMETER);
3212	AssertMsgReturn(pReq->Hdr.cbReq == RT_UOFFSETOF_DYN(GMMALLOCATEPAGESREQ, aPages[pReq->cPages]),
3213	("%#x != %#x\n", pReq->Hdr.cbReq, RT_UOFFSETOF_DYN(GMMALLOCATEPAGESREQ, aPages[pReq->cPages])),
3214	VERR_INVALID_PARAMETER);
3215
3216	return GMMR0AllocatePages(pGVM, idCpu, pReq->cPages, &pReq->aPages[0], pReq->enmAccount);
3217	}
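
/*
 * Editorial sketch, not part of the GMM sources. The request wrapper above
 * accepts a packet only when its stated size matches the header plus exactly
 * cPages descriptors. A portable way to express that check is shown here with
 * plain offsetof instead of the IPRT RT_UOFFSETOF_DYN macro. The SketchReq and
 * SketchPageDesc layouts are assumptions for illustration, not the real
 * GMMALLOCATEPAGESREQ.
 */
```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical page descriptor; the real GMMPAGEDESC differs.
struct SketchPageDesc
{
    uint64_t HCPhysGCPhys;
    uint32_t idPage;
    uint32_t idSharedPage;
};

// Hypothetical variable-length request: a small header followed by cPages descriptors.
struct SketchReq
{
    uint32_t       cbReq;      // total size claimed by the caller
    uint32_t       cPages;     // number of descriptors that follow
    SketchPageDesc aPages[1];  // first descriptor; the rest follow in memory
};

// Returns true if cbReq equals header size plus cPages descriptors.
inline bool sketchIsReqSizeValid(const SketchReq &req)
{
    size_t const cbExpected = offsetof(SketchReq, aPages)
                            + static_cast<size_t>(req.cPages) * sizeof(SketchPageDesc);
    return req.cbReq == cbExpected;
}
```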
3218
3219
3220	/**
3221	* Allocate a large page to represent guest RAM.
3222	*
3223	* The allocated pages are zeroed upon return.
3224	*
3225	* @returns VBox status code:
3226	* @retval VINF_SUCCESS on success.
3227	* @retval VERR_NOT_OWNER if the caller is not an EMT.
3228	* @retval VERR_GMM_HIT_GLOBAL_LIMIT if we've exhausted the available pages.
3229	* @retval VERR_GMM_HIT_VM_ACCOUNT_LIMIT if we've hit the VM account limit,
3230	* that is we're trying to allocate more than we've reserved.
3231	* @retval VERR_TRY_AGAIN if the host is temporarily out of large pages.
3232	* @returns see GMMR0AllocatePages.
3233	*
3234	* @param pGVM The global (ring-0) VM structure.
3235	* @param idCpu The VCPU id.
3236	* @param cbPage Large page size.
3237	* @param pIdPage Where to return the GMM page ID of the page.
3238	* @param pHCPhys Where to return the host physical address of the page.
3239	*/
3240	GMMR0DECL(int) GMMR0AllocateLargePage(PGVM pGVM, VMCPUID idCpu, uint32_t cbPage, uint32_t *pIdPage, RTHCPHYS *pHCPhys)
3241	{
3242	LogFlow(("GMMR0AllocateLargePage: pGVM=%p cbPage=%x\n", pGVM, cbPage));
3243
3244	AssertPtrReturn(pIdPage, VERR_INVALID_PARAMETER);
3245	*pIdPage = NIL_GMM_PAGEID;
3246	AssertPtrReturn(pHCPhys, VERR_INVALID_PARAMETER);
3247	*pHCPhys = NIL_RTHCPHYS;
3248	AssertReturn(cbPage == GMM_CHUNK_SIZE, VERR_INVALID_PARAMETER);
3249
3250	/*
3251	* Validate GVM + idCpu, get basics and take the semaphore.
3252	*/
3253	PGMM pGMM;
3254	GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
3255	int rc = GVMMR0ValidateGVMandEMT(pGVM, idCpu);
3256	AssertRCReturn(rc, rc);
3257
3258	VMMR0EMTBLOCKCTX Ctx;
3259	PGVMCPU pGVCpu = &pGVM->aCpus[idCpu];
3260	rc = VMMR0EmtPrepareToBlock(pGVCpu, VINF_SUCCESS, "GMMR0AllocateLargePage", pGMM, &Ctx);
3261	AssertRCReturn(rc, rc);
3262
3263	rc = gmmR0MutexAcquire(pGMM);
3264	if (RT_SUCCESS(rc))
3265	{
3266	if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
3267	{
3268	/*
3269	* Check the quota.
3270	*/
3271	/** @todo r=bird: Quota checking could be done w/o the giant mutex but using
3272	* a VM specific mutex... */
3273	if (RT_LIKELY( pGVM->gmm.s.Stats.Allocated.cBasePages + pGVM->gmm.s.Stats.cBalloonedPages + GMM_CHUNK_NUM_PAGES
3274	<= pGVM->gmm.s.Stats.Reserved.cBasePages))
3275	{
3276	/*
3277	* Allocate a new large page chunk.
3278	*
3279	* Note! We leave the giant GMM lock temporarily as the allocation might
3280	* take a long time. gmmR0RegisterChunk will retake it (ugly).
3281	*/
3282	AssertCompile(GMM_CHUNK_SIZE == _2M);
3283	gmmR0MutexRelease(pGMM);
3284
3285	RTR0MEMOBJ hMemObj;
3286	rc = RTR0MemObjAllocLarge(&hMemObj, GMM_CHUNK_SIZE, GMM_CHUNK_SIZE, RTMEMOBJ_ALLOC_LARGE_F_FAST);
3287	if (RT_SUCCESS(rc))
3288	{
3289	*pHCPhys = RTR0MemObjGetPagePhysAddr(hMemObj, 0);
3290
3291	/*
3292	* Register the chunk as fully allocated.
3293	* Note! As mentioned above, this will return owning the mutex on success.
3294	*/
3295	PGMMCHUNK pChunk = NULL;
3296	PGMMCHUNKFREESET const pSet = pGMM->fBoundMemoryMode ? &pGVM->gmm.s.Private : &pGMM->PrivateX;
3297	rc = gmmR0RegisterChunk(pGMM, pSet, hMemObj, pGVM->hSelf, pGVM->pSession, GMM_CHUNK_FLAGS_LARGE_PAGE,
3298	0 /*cPages*/, NULL /*paPages*/, NULL /*piPage*/, &pChunk);
3299	if (RT_SUCCESS(rc))
3300	{
3301	/*
3302	* The gmmR0RegisterChunk call already marked all pages allocated,
3303	* so we just have to fill in the return values and update stats now.
3304	*/
3305	*pIdPage = pChunk->Core.Key << GMM_CHUNKID_SHIFT;
3306
3307	/* Update accounting. */
3308	pGVM->gmm.s.Stats.Allocated.cBasePages += GMM_CHUNK_NUM_PAGES;
3309	pGVM->gmm.s.Stats.cPrivatePages += GMM_CHUNK_NUM_PAGES;
3310	pGMM->cAllocatedPages += GMM_CHUNK_NUM_PAGES;
3311
3312	gmmR0LinkChunk(pChunk, pSet);
3313	gmmR0MutexRelease(pGMM);
3314
3315	VMMR0EmtResumeAfterBlocking(pGVCpu, &Ctx);
3316	LogFlow(("GMMR0AllocateLargePage: returns VINF_SUCCESS\n"));
3317	return VINF_SUCCESS;
3318	}
3319
3320	/*
3321	* Bail out.
3322	*/
3323	RTR0MemObjFree(hMemObj, true /* fFreeMappings */);
3324	*pHCPhys = NIL_RTHCPHYS;
3325	}
3326	/** @todo r=bird: Turn VERR_NO_MEMORY etc into VERR_TRY_AGAIN? Docs say we
3327	* return it, but I am sure IPRT doesn't... */
3328	}
3329	else
3330	{
3331	Log(("GMMR0AllocateLargePage: Reserved=%#llx Allocated+Requested=%#llx+%#x!\n",
3332	pGVM->gmm.s.Stats.Reserved.cBasePages, pGVM->gmm.s.Stats.Allocated.cBasePages, GMM_CHUNK_NUM_PAGES));
3333	gmmR0MutexRelease(pGMM);
3334	rc = VERR_GMM_HIT_VM_ACCOUNT_LIMIT;
3335	}
3336	}
3337	else
3338	{
3339	gmmR0MutexRelease(pGMM);
3340	rc = VERR_GMM_IS_NOT_SANE;
3341	}
3342	}
3343
3344	VMMR0EmtResumeAfterBlocking(pGVCpu, &Ctx);
3345	LogFlow(("GMMR0AllocateLargePage: returns %Rrc\n", rc));
3346	return rc;
3347	}
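
/*
 * Editorial sketch, not part of the GMM sources. The quota check in
 * GMMR0AllocateLargePage above asks whether one more full chunk still fits
 * inside the VM's base reservation once ballooned pages are counted as used.
 * The struct and function names below are hypothetical, and the 4 KB page size
 * is an assumption (only the 2 MB chunk size is asserted in the code above).
 */
```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified stand-in for the per-VM statistics.
struct SketchVmStats
{
    uint64_t cReservedBasePages;   // pages reserved up front
    uint64_t cAllocatedBasePages;  // pages currently charged to the base account
    uint64_t cBalloonedPages;      // pages handed back through ballooning
};

// Assumption: 2 MB chunks and 4 KB pages, i.e. 512 pages per chunk.
constexpr uint64_t kSketchChunkPages = (2u * 1024 * 1024) / 4096;

// Returns true if one more large page fits within the reservation.
inline bool sketchLargePageFitsQuota(const SketchVmStats &stats)
{
    return stats.cAllocatedBasePages + stats.cBalloonedPages + kSketchChunkPages
        <= stats.cReservedBasePages;
}
```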
3348
3349
3350	/**
3351	* Free a large page.
3352	*
3353	* @returns VBox status code.
3354	* @param pGVM The global (ring-0) VM structure.
3355	* @param idCpu The VCPU id.
3356	* @param idPage The large page id.
3357	*/
3358	GMMR0DECL(int) GMMR0FreeLargePage(PGVM pGVM, VMCPUID idCpu, uint32_t idPage)
3359	{
3360	LogFlow(("GMMR0FreeLargePage: pGVM=%p idPage=%x\n", pGVM, idPage));
3361
3362	/*
3363	* Validate, get basics and take the semaphore.
3364	*/
3365	PGMM pGMM;
3366	GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
3367	int rc = GVMMR0ValidateGVMandEMT(pGVM, idCpu);
3368	if (RT_FAILURE(rc))
3369	return rc;
3370
3371	gmmR0MutexAcquire(pGMM);
3372	if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
3373	{
3374	const unsigned cPages = (GMM_CHUNK_SIZE >> PAGE_SHIFT);
3375
3376	if (RT_UNLIKELY(pGVM->gmm.s.Stats.Allocated.cBasePages < cPages))
3377	{
3378	Log(("GMMR0FreeLargePage: allocated=%#llx cPages=%#x!\n", pGVM->gmm.s.Stats.Allocated.cBasePages, cPages));
3379	gmmR0MutexRelease(pGMM);
3380	return VERR_GMM_ATTEMPT_TO_FREE_TOO_MUCH;
3381	}
3382
3383	PGMMPAGE pPage = gmmR0GetPage(pGMM, idPage);
3384	if (RT_LIKELY( pPage
3385	&& GMM_PAGE_IS_PRIVATE(pPage)))
3386	{
3387	PGMMCHUNK pChunk = gmmR0GetChunk(pGMM, idPage >> GMM_CHUNKID_SHIFT);
3388	Assert(pChunk);
3389	Assert(pChunk->cFree < GMM_CHUNK_NUM_PAGES);
3390	Assert(pChunk->cPrivate > 0);
3391
3392	/* Release the memory immediately. */
3393	gmmR0FreeChunk(pGMM, NULL, pChunk, false /*fRelaxedSem*/); /** @todo this can be relaxed too! */
3394
3395	/* Update accounting. */
3396	pGVM->gmm.s.Stats.Allocated.cBasePages -= cPages;
3397	pGVM->gmm.s.Stats.cPrivatePages -= cPages;
3398	pGMM->cAllocatedPages -= cPages;
3399	}
3400	else
3401	rc = VERR_GMM_PAGE_NOT_FOUND;
3402	}
3403	else
3404	rc = VERR_GMM_IS_NOT_SANE;
3405
3406	gmmR0MutexRelease(pGMM);
3407	LogFlow(("GMMR0FreeLargePage: returns %Rrc\n", rc));
3408	return rc;
3409	}
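
/*
 * Editorial sketch, not part of the GMM sources. Both large-page functions
 * above convert between chunk IDs and page IDs by shifting with
 * GMM_CHUNKID_SHIFT. The helpers below show that mapping under the assumption
 * of 2 MB chunks and 4 KB pages, which gives a shift of 9. The constant and
 * function names are hypothetical.
 */
```cpp
#include <cassert>
#include <cstdint>

// Assumption: log2(2 MB / 4 KB) = 9 page-index bits per chunk.
constexpr unsigned kSketchChunkIdShift = 9;

// ID of the first page in a chunk; a large page is identified by this ID.
inline uint32_t sketchFirstPageId(uint32_t idChunk)
{
    return idChunk << kSketchChunkIdShift;
}

// Chunk that a page ID belongs to.
inline uint32_t sketchChunkIdFromPageId(uint32_t idPage)
{
    return idPage >> kSketchChunkIdShift;
}

// Index of the page within its chunk.
inline uint32_t sketchPageIndexInChunk(uint32_t idPage)
{
    return idPage & ((1u << kSketchChunkIdShift) - 1);
}
```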
3410
3411
3412	/**
3413	* VMMR0 request wrapper for GMMR0FreeLargePage.
3414	*
3415	* @returns see GMMR0FreeLargePage.
3416	* @param pGVM The global (ring-0) VM structure.
3417	* @param idCpu The VCPU id.
3418	* @param pReq Pointer to the request packet.
3419	*/
3420	GMMR0DECL(int) GMMR0FreeLargePageReq(PGVM pGVM, VMCPUID idCpu, PGMMFREELARGEPAGEREQ pReq)
3421	{
3422	/*
3423	* Validate input and pass it on.
3424	*/
3425	AssertPtrReturn(pReq, VERR_INVALID_POINTER);
3426	AssertMsgReturn(pReq->Hdr.cbReq == sizeof(GMMFREELARGEPAGEREQ),
3427	("%#x != %#x\n", pReq->Hdr.cbReq, sizeof(GMMFREELARGEPAGEREQ)),
3428	VERR_INVALID_PARAMETER);
3429
3430	return GMMR0FreeLargePage(pGVM, idCpu, pReq->idPage);
3431	}
3432
3433
3434	/**
3435	* @callback_method_impl{FNGVMMR0ENUMCALLBACK,
3436	* Used by gmmR0FreeChunkFlushPerVmTlbs().}
3437	*/
3438	static DECLCALLBACK(int) gmmR0InvalidatePerVmChunkTlbCallback(PGVM pGVM, void *pvUser)
3439	{
3440	RT_NOREF(pvUser);
3441	if (pGVM->gmm.s.hChunkTlbSpinLock != NIL_RTSPINLOCK)
3442	{
3443	RTSpinlockAcquire(pGVM->gmm.s.hChunkTlbSpinLock);
3444	uintptr_t i = RT_ELEMENTS(pGVM->gmm.s.aChunkTlbEntries);
3445	while (i-- > 0)
3446	{
3447	pGVM->gmm.s.aChunkTlbEntries[i].idGeneration = UINT64_MAX;
3448	pGVM->gmm.s.aChunkTlbEntries[i].pChunk = NULL;
3449	}
3450	RTSpinlockRelease(pGVM->gmm.s.hChunkTlbSpinLock);
3451	}
3452	return VINF_SUCCESS;
3453	}
3454
3455
3456	/**
3457	* Called by gmmR0FreeChunk when we reach the threshold for wrapping around the
3458	* free generation ID value.
3459	*
3460	* This is done at 2^62 - 1, which allows us to drop all locks, as it will
3461	* take a while before another 12 exa (13 835 058 055 282 163 712) calls to
3462	* gmmR0FreeChunk can be made and cause a real wrap-around. We do two
3463	* invalidation passes and reset the generation ID between them. This will
3464	* make sure there are no false positives.
3465	*
3466	* @param pGMM Pointer to the GMM instance.
3467	*/
3468	static void gmmR0FreeChunkFlushPerVmTlbs(PGMM pGMM)
3469	{
3470	/*
3471	* First invalidation pass.
3472	*/
3473	int rc = GVMMR0EnumVMs(gmmR0InvalidatePerVmChunkTlbCallback, NULL);
3474	AssertRCSuccess(rc);
3475
3476	/*
3477	* Reset the generation number.
3478	*/
3479	RTSpinlockAcquire(pGMM->hSpinLockTree);
3480	ASMAtomicWriteU64(&pGMM->idFreeGeneration, 1);
3481	RTSpinlockRelease(pGMM->hSpinLockTree);
3482
3483	/*
3484	* Second invalidation pass.
3485	*/
3486	rc = GVMMR0EnumVMs(gmmR0InvalidatePerVmChunkTlbCallback, NULL);
3487	AssertRCSuccess(rc);
3488	}
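
/*
 * Editorial sketch, not part of the GMM sources. The threshold arithmetic for
 * the free generation ID can be checked at compile time: gmmR0FreeChunk
 * triggers the flush at UINT64_MAX / 4, which is 2^62 - 1, and that leaves
 * 3 * 2^62 further increments before a 64-bit counter would wrap. The kSketch*
 * names and the helper function are hypothetical.
 */
```cpp
#include <cassert>
#include <cstdint>

// Flush threshold as tested in gmmR0FreeChunk: UINT64_MAX / 4.
constexpr uint64_t kSketchFlushThreshold = UINT64_MAX / 4;
static_assert(kSketchFlushThreshold == (uint64_t{1} << 62) - 1, "threshold is 2^62 - 1");

// Increments remaining after the threshold before the counter would wrap.
constexpr uint64_t kSketchHeadroom = UINT64_MAX - kSketchFlushThreshold;
static_assert(kSketchHeadroom == 3 * (uint64_t{1} << 62), "headroom is 3 * 2^62");
static_assert((kSketchHeadroom >> 60) == 12, "which is 12 * 2^60");

// True exactly once per counter cycle, when the threshold value is reached.
inline bool sketchShouldFlush(uint64_t idFreeGeneration)
{
    return idFreeGeneration == kSketchFlushThreshold;
}
```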
3489
3490
3491	/**
3492	* Frees a chunk, giving it back to the host OS.
3493	*
3494	* @param pGMM Pointer to the GMM instance.
3495	* @param pGVM This is set when called from GMMR0CleanupVM so we can
3496	* unmap and free the chunk in one go.
3497	* @param pChunk The chunk to free.
3498	* @param fRelaxedSem Whether we can release the semaphore while doing the
3499	* freeing (@c true) or not.
3500	*/
3501	static bool gmmR0FreeChunk(PGMM pGMM, PGVM pGVM, PGMMCHUNK pChunk, bool fRelaxedSem)
3502	{
3503	Assert(pChunk->Core.Key != NIL_GMM_CHUNKID);
3504
3505	GMMR0CHUNKMTXSTATE MtxState;
3506	gmmR0ChunkMutexAcquire(&MtxState, pGMM, pChunk, GMMR0CHUNK_MTX_KEEP_GIANT);
3507
3508	/*
3509	* Cleanup hack! Unmap the chunk from the caller's address space.
3510	* This shouldn't happen, so screw lock contention...
3511	*/
3512	if (pChunk->cMappingsX && pGVM)
3513	gmmR0UnmapChunkLocked(pGMM, pGVM, pChunk);
3514
3515	/*
3516	* If there are current mappings of the chunk, then request the
3517	* VMs to unmap them. Reposition the chunk in the free list so
3518	* it won't be a likely candidate for allocations.
3519	*/
3520	if (pChunk->cMappingsX)
3521	{
3522	/** @todo R0 -> VM request */
3523	/* The chunk can be mapped by more than one VM if fBoundMemoryMode is false! */
3524	Log(("gmmR0FreeChunk: chunk still has %d mappings; don't free!\n", pChunk->cMappingsX));
3525	gmmR0ChunkMutexRelease(&MtxState, pChunk);
3526	return false;
3527	}
3528
3529
3530	/*
3531	* Save and trash the handle.
3532	*/
3533	RTR0MEMOBJ const hMemObj = pChunk->hMemObj;
3534	pChunk->hMemObj = NIL_RTR0MEMOBJ;
3535
3536	/*
3537	* Unlink it from everywhere.
3538	*/
3539	gmmR0UnlinkChunk(pChunk);
3540
3541	RTSpinlockAcquire(pGMM->hSpinLockTree);
3542
3543	RTListNodeRemove(&pChunk->ListNode);
3544
3545	PAVLU32NODECORE pCore = RTAvlU32Remove(&pGMM->pChunks, pChunk->Core.Key);
3546	Assert(pCore == &pChunk->Core); NOREF(pCore);
3547
3548	PGMMCHUNKTLBE pTlbe = &pGMM->ChunkTLB.aEntries[GMM_CHUNKTLB_IDX(pChunk->Core.Key)];
3549	if (pTlbe->pChunk == pChunk)
3550	{
3551	pTlbe->idChunk = NIL_GMM_CHUNKID;
3552	pTlbe->pChunk = NULL;
3553	}
3554
3555	Assert(pGMM->cChunks > 0);
3556	pGMM->cChunks--;
3557
3558	uint64_t const idFreeGeneration = ASMAtomicIncU64(&pGMM->idFreeGeneration);
3559
3560	RTSpinlockRelease(pGMM->hSpinLockTree);
3561
3562	pGMM->cFreedChunks++;
3563
3564	/* Drop the lock. */
3565	gmmR0ChunkMutexRelease(&MtxState, NULL);
3566	if (fRelaxedSem)
3567	gmmR0MutexRelease(pGMM);
3568
3569	/*
3570	* Flush per VM chunk TLBs if we're getting remotely close to a generation wraparound.
3571	*/
3572	if (idFreeGeneration == UINT64_MAX / 4)
3573	gmmR0FreeChunkFlushPerVmTlbs(pGMM);
3574
3575	/*
3576	* Free the Chunk ID and all memory associated with the chunk.
3577	*/
3578	gmmR0FreeChunkId(pGMM, pChunk->Core.Key);
3579	pChunk->Core.Key = NIL_GMM_CHUNKID;
3580
3581	RTMemFree(pChunk->paMappingsX);
3582	pChunk->paMappingsX = NULL;
3583
3584	RTMemFree(pChunk);
3585
3586	#ifndef VBOX_WITH_LINEAR_HOST_PHYS_MEM
3587	int rc = RTR0MemObjFree(hMemObj, true /* fFreeMappings */);
3588	#else
3589	int rc = RTR0MemObjFree(hMemObj, false /* fFreeMappings */);
3590	#endif
3591	AssertLogRelRC(rc);
3592
3593	if (fRelaxedSem)
3594	gmmR0MutexAcquire(pGMM);
3595	return fRelaxedSem;
3596	}
3597
3598
3599	/**
3600	* Free page worker.
3601	*
3602	* The caller does all the statistic decrementing, we do all the incrementing.
3603	*
3604	* @param pGMM Pointer to the GMM instance data.
3605	* @param pGVM Pointer to the GVM instance.
3606	* @param pChunk Pointer to the chunk this page belongs to.
3607	* @param idPage The Page ID.
3608	* @param pPage Pointer to the page.
3609	*/
3610	static void gmmR0FreePageWorker(PGMM pGMM, PGVM pGVM, PGMMCHUNK pChunk, uint32_t idPage, PGMMPAGE pPage)
3611	{
3612	Log3(("F pPage=%p iPage=%#x/%#x u2State=%d iFreeHead=%#x\n",
3613	pPage, pPage - &pChunk->aPages[0], idPage, pPage->Common.u2State, pChunk->iFreeHead)); NOREF(idPage);
3614
3615	/*
3616	* Put the page on the free list.
3617	*/
3618	pPage->u = 0;
3619	pPage->Free.u2State = GMM_PAGE_STATE_FREE;
3620	pPage->Free.fZeroed = false;
3621	Assert(pChunk->iFreeHead < RT_ELEMENTS(pChunk->aPages) || pChunk->iFreeHead == UINT16_MAX);
3622	pPage->Free.iNext = pChunk->iFreeHead;
3623	pChunk->iFreeHead = pPage - &pChunk->aPages[0];
3624
3625	/*
3626	* Update statistics (the cShared/cPrivate stats are up to date already),
3627	* and relink the chunk if necessary.
3628	*/
3629	unsigned const cFree = pChunk->cFree;
3630	if ( !cFree
3631	|| gmmR0SelectFreeSetList(cFree) != gmmR0SelectFreeSetList(cFree + 1))
3632	{
3633	gmmR0UnlinkChunk(pChunk);
3634	pChunk->cFree++;
3635	gmmR0SelectSetAndLinkChunk(pGMM, pGVM, pChunk);
3636	}
3637	else
3638	* The current strategy is to try to give it back if there are other chunks
3639	pChunk->cFree = cFree + 1;
3640	pChunk->pSet->cFreePages++;
3641	}
3642
3643	/*
3644	* If the chunk becomes empty, consider giving memory back to the host OS.
3645	*
3646	* The current strategy is to try give it back if there are other chunks
3647	* in this free list, meaning if there are at least 240 free pages in this
3648	* category. Note that since there are probably mappings of the chunk,
3649	* it won't be freed up instantly, which probably screws up this logic
3650	* a bit...
3651	*/
3652	/** @todo Do this on the way out. */
3653	if (RT_LIKELY( pChunk->cFree != GMM_CHUNK_NUM_PAGES
3654	|| pChunk->pFreeNext == NULL
3655	|| pChunk->pFreePrev == NULL /** @todo this is probably misfiring, see reset... */))
3656	{ /* likely */ }
3657	else
3658	gmmR0FreeChunk(pGMM, NULL, pChunk, false);
3659	}
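
/*
 * Editorial sketch, not part of the GMM sources. gmmR0FreePageWorker above
 * pushes the freed page onto a per-chunk free list that is linked by page
 * index rather than by pointer, with UINT16_MAX as the end marker. The model
 * below shows that push, plus a matching pop that is assumed here as the
 * natural inverse (the real allocation path is not shown in this part of the
 * file). The tiny chunk size and all Sketch* names are hypothetical.
 */
```cpp
#include <cassert>
#include <cstdint>

constexpr uint16_t kSketchNilIndex      = UINT16_MAX;  // end-of-list marker
constexpr unsigned kSketchPagesPerChunk = 8;           // tiny chunk for illustration

struct SketchPage
{
    bool     fFree = false;
    uint16_t iNext = kSketchNilIndex;  // index of the next free page in the chunk
};

struct SketchChunk
{
    SketchPage aPages[kSketchPagesPerChunk];
    uint16_t   iFreeHead = kSketchNilIndex;
    unsigned   cFree     = 0;
};

// Pushes page iPage onto the head of the chunk's free list.
inline void sketchFreePage(SketchChunk &chunk, uint16_t iPage)
{
    SketchPage &page = chunk.aPages[iPage];
    page.fFree      = true;
    page.iNext      = chunk.iFreeHead;
    chunk.iFreeHead = iPage;
    chunk.cFree++;
}

// Pops the most recently freed page, or returns kSketchNilIndex if none is free.
inline uint16_t sketchAllocPage(SketchChunk &chunk)
{
    uint16_t const iPage = chunk.iFreeHead;
    if (iPage == kSketchNilIndex)
        return kSketchNilIndex;
    SketchPage &page = chunk.aPages[iPage];
    chunk.iFreeHead = page.iNext;
    page.fFree      = false;
    page.iNext      = kSketchNilIndex;
    chunk.cFree--;
    return iPage;
}
```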
3660
3661
3662	/**
3663	* Frees a shared page, the page is known to exist and be valid and such.
3664	*
3665	* @param pGMM Pointer to the GMM instance.
3666	* @param pGVM Pointer to the GVM instance.
3667	* @param idPage The page id.
3668	* @param pPage The page structure.
3669	*/
3670	DECLINLINE(void) gmmR0FreeSharedPage(PGMM pGMM, PGVM pGVM, uint32_t idPage, PGMMPAGE pPage)
3671	{
3672	PGMMCHUNK pChunk = gmmR0GetChunk(pGMM, idPage >> GMM_CHUNKID_SHIFT);
3673	Assert(pChunk);
3674	Assert(pChunk->cFree < GMM_CHUNK_NUM_PAGES);
3675	Assert(pChunk->cShared > 0);
3676	Assert(pGMM->cSharedPages > 0);
3677	Assert(pGMM->cAllocatedPages > 0);
3678	Assert(!pPage->Shared.cRefs);
3679
3680	pChunk->cShared--;
3681	pGMM->cAllocatedPages--;
3682	pGMM->cSharedPages--;
3683	gmmR0FreePageWorker(pGMM, pGVM, pChunk, idPage, pPage);
3684	}
3685
3686
3687	/**
3688	* Frees a private page, the page is known to exist and be valid and such.
3689	*
3690	* @param pGMM Pointer to the GMM instance.
3691	* @param pGVM Pointer to the GVM instance.
3692	* @param idPage The page id.
3693	* @param pPage The page structure.
3694	*/
3695	DECLINLINE(void) gmmR0FreePrivatePage(PGMM pGMM, PGVM pGVM, uint32_t idPage, PGMMPAGE pPage)
3696	{
3697	PGMMCHUNK pChunk = gmmR0GetChunk(pGMM, idPage >> GMM_CHUNKID_SHIFT);
3698	Assert(pChunk);
3699	Assert(pChunk->cFree < GMM_CHUNK_NUM_PAGES);
3700	Assert(pChunk->cPrivate > 0);
3701	Assert(pGMM->cAllocatedPages > 0);
3702
3703	pChunk->cPrivate--;
3704	pGMM->cAllocatedPages--;
3705	gmmR0FreePageWorker(pGMM, pGVM, pChunk, idPage, pPage);
3706	}
3707
3708
3709	/**
3710	* Common worker for GMMR0FreePages and GMMR0BalloonedPages.
3711	*
3712	* @returns VBox status code:
3713	* @retval xxx
3714	*
3715	* @param pGMM Pointer to the GMM instance data.
3716	* Check that the request isn't impossible with respect to the account status.
3717	* @param cPages The number of pages to free.
3718	* @param paPages Pointer to the page descriptors.
3719	* @param enmAccount The account this relates to.
3720	*/
3721	static int gmmR0FreePages(PGMM pGMM, PGVM pGVM, uint32_t cPages, PGMMFREEPAGEDESC paPages, GMMACCOUNT enmAccount)
3722	{
3723	/*
3724	* Check that the request isn't impossible with respect to the account status.
3725	*/
3726	switch (enmAccount)
3727	{
3728	case GMMACCOUNT_BASE:
3729	if (RT_UNLIKELY(pGVM->gmm.s.Stats.Allocated.cBasePages < cPages))
3730	{
3731	Log(("gmmR0FreePages: allocated=%#llx cPages=%#x!\n", pGVM->gmm.s.Stats.Allocated.cBasePages, cPages));
3732	return VERR_GMM_ATTEMPT_TO_FREE_TOO_MUCH;
3733	}
3734	break;
3735	case GMMACCOUNT_SHADOW:
3736	if (RT_UNLIKELY(pGVM->gmm.s.Stats.Allocated.cShadowPages < cPages))
3737	{
3738	Log(("gmmR0FreePages: allocated=%#llx cPages=%#x!\n", pGVM->gmm.s.Stats.Allocated.cShadowPages, cPages));
3739	return VERR_GMM_ATTEMPT_TO_FREE_TOO_MUCH;
3740	}
3741	break;
3742	case GMMACCOUNT_FIXED:
3743	if (RT_UNLIKELY(pGVM->gmm.s.Stats.Allocated.cFixedPages < cPages))
3744	{
3745	Log(("gmmR0FreePages: allocated=%#llx cPages=%#x!\n", pGVM->gmm.s.Stats.Allocated.cFixedPages, cPages));
3746	return VERR_GMM_ATTEMPT_TO_FREE_TOO_MUCH;
3747	}
3748	break;
3749	default:
3750	AssertMsgFailedReturn(("enmAccount=%d\n", enmAccount), VERR_IPE_NOT_REACHED_DEFAULT_CASE);
3751	}
3752
3753	/*
3754	* Walk the descriptors and free the pages.
3755	*
3756	* Statistics (except the account) are being updated as we go along,
3757	* unlike the alloc code. Also, stop on the first error.
3758	*/
3759	int rc = VINF_SUCCESS;
3760	uint32_t iPage;
3761	for (iPage = 0; iPage < cPages; iPage++)
3762	{
3763	uint32_t idPage = paPages[iPage].idPage;
3764	PGMMPAGE pPage = gmmR0GetPage(pGMM, idPage);
3765	if (RT_LIKELY(pPage))
3766	{
3767	if (RT_LIKELY(GMM_PAGE_IS_PRIVATE(pPage)))
3768	{
3769	if (RT_LIKELY(pPage->Private.hGVM == pGVM->hSelf))
3770	{
3771	Assert(pGVM->gmm.s.Stats.cPrivatePages);
3772	pGVM->gmm.s.Stats.cPrivatePages--;
3773	gmmR0FreePrivatePage(pGMM, pGVM, idPage, pPage);
3774	}
3775	else
3776	{
3777	Log(("gmmR0FreePages: #%#x/%#x: not owner! hGVM=%#x hSelf=%#x\n", iPage, idPage,
3778	pPage->Private.hGVM, pGVM->hSelf));
3779	rc = VERR_GMM_NOT_PAGE_OWNER;
3780	break;
3781	}
3782	}
3783	else if (RT_LIKELY(GMM_PAGE_IS_SHARED(pPage)))
3784	{
3785	Assert(pGVM->gmm.s.Stats.cSharedPages);
3786	Assert(pPage->Shared.cRefs);
3787	#if defined(VBOX_WITH_PAGE_SHARING) && defined(VBOX_STRICT)
3788	if (pPage->Shared.u14Checksum)
3789	{
3790	uint32_t uChecksum = gmmR0StrictPageChecksum(pGMM, pGVM, idPage);
3791	uChecksum &= UINT32_C(0x00003fff);
3792	AssertMsg(!uChecksum || uChecksum == pPage->Shared.u14Checksum,
3793	("%#x vs %#x - idPage=%#x\n", uChecksum, pPage->Shared.u14Checksum, idPage));
3794	}
3795	#endif
3796	pGVM->gmm.s.Stats.cSharedPages--;
3797	if (!--pPage->Shared.cRefs)
3798	gmmR0FreeSharedPage(pGMM, pGVM, idPage, pPage);
3799	else
3800	{
3801	Assert(pGMM->cDuplicatePages);
3802	pGMM->cDuplicatePages--;
3803	}
3804	}
3805	else
3806	{
3807	Log(("gmmR0FreePages: #%#x/%#x: already free!\n", iPage, idPage));
3808	rc = VERR_GMM_PAGE_ALREADY_FREE;
3809	break;
3810	}
3811	}
3812	else
3813	{
3814	Log(("gmmR0FreePages: #%#x/%#x: not found!\n", iPage, idPage));
3815	rc = VERR_GMM_PAGE_NOT_FOUND;
3816	break;
3817	}
3818	paPages[iPage].idPage = NIL_GMM_PAGEID;
3819	}
3820
3821	/*
3822	* Update the account.
3823	*/
3824	switch (enmAccount)
3825	{
3826	case GMMACCOUNT_BASE: pGVM->gmm.s.Stats.Allocated.cBasePages -= iPage; break;
3827	case GMMACCOUNT_SHADOW: pGVM->gmm.s.Stats.Allocated.cShadowPages -= iPage; break;
3828	case GMMACCOUNT_FIXED: pGVM->gmm.s.Stats.Allocated.cFixedPages -= iPage; break;
3829	default:
3830	AssertMsgFailedReturn(("enmAccount=%d\n", enmAccount), VERR_IPE_NOT_REACHED_DEFAULT_CASE);
3831	}
3832
3833	/*
3834	* Any threshold stuff to be done here?
3835	*/
3836
3837	return rc;
3838	}
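
/*
 * Illustrative sketch only, not part of GMM: the worker above rejects an
 * impossible request up front, stops at the first bad descriptor, and then
 * debits the account by the number of pages actually freed. The standalone
 * example below shows that shape. All names (SketchAccount, sketchFreePages,
 * the status values) are hypothetical, and page validity is reduced to a
 * simple lookup table.
 */
```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical status codes for this sketch (not the VBox values).
enum SketchStatus { kOk = 0, kTooMuch = -1, kNotFound = -2 };

// Hypothetical, simplified stand-in for a per-VM account.
struct SketchAccount { uint64_t cAllocated; };

// Frees the pages listed in `ids`; `valid[id]` says whether page `id` exists.
inline SketchStatus sketchFreePages(SketchAccount &acct, std::vector<uint32_t> &ids,
                                    const std::vector<bool> &valid)
{
    // Up-front check: nothing is touched if the request exceeds the account.
    if (acct.cAllocated < ids.size())
        return kTooMuch;

    SketchStatus rc = kOk;
    size_t i = 0;
    for (; i < ids.size(); i++)
    {
        if (ids[i] >= valid.size() || !valid[ids[i]])
        {
            rc = kNotFound;          // stop on the first error
            break;
        }
        ids[i] = UINT32_MAX;         // mark the descriptor as consumed (NIL stand-in)
    }

    // i is the number of descriptors processed successfully, even on error.
    acct.cAllocated -= i;
    return rc;
}
```
/*
 * With an account of 5 pages and descriptors {0, 1, 2, 3} where page 2 does
 * not exist, the sketch frees pages 0 and 1, returns the "not found" status
 * and leaves 3 pages on the account.
 */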
3839
3840
3841	/**
3842	* Free one or more pages.
3843	*
3844	* This is typically used at reset time or power off.
3845	*
3846	* @returns VBox status code:
3847	* @retval xxx
3848	*
3849	* @param pGVM The global (ring-0) VM structure.
3850	* @param idCpu The VCPU id.
3851	* @param cPages The number of pages to free.
3852	* @param paPages Pointer to the page descriptors containing the page IDs
3853	* for each page.
3854	* @param enmAccount The account this relates to.
3855	* @thread EMT.
3856	*/
3857	GMMR0DECL(int) GMMR0FreePages(PGVM pGVM, VMCPUID idCpu, uint32_t cPages, PGMMFREEPAGEDESC paPages, GMMACCOUNT enmAccount)
3858	{
3859	LogFlow(("GMMR0FreePages: pGVM=%p cPages=%#x paPages=%p enmAccount=%d\n", pGVM, cPages, paPages, enmAccount));
3860
3861	/*
3862	* Validate input and get the basics.
3863	*/
3864	PGMM pGMM;
3865	GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
3866	int rc = GVMMR0ValidateGVMandEMT(pGVM, idCpu);
3867	if (RT_FAILURE(rc))
3868	return rc;
3869
3870	AssertPtrReturn(paPages, VERR_INVALID_PARAMETER);
3871	AssertMsgReturn(enmAccount > GMMACCOUNT_INVALID && enmAccount < GMMACCOUNT_END, ("%d\n", enmAccount), VERR_INVALID_PARAMETER);
3872	AssertMsgReturn(cPages > 0 && cPages < RT_BIT(32 - PAGE_SHIFT), ("%#x\n", cPages), VERR_INVALID_PARAMETER);
3873
3874	for (unsigned iPage = 0; iPage < cPages; iPage++)
3875	AssertMsgReturn( paPages[iPage].idPage <= GMM_PAGEID_LAST
3876	/*|| paPages[iPage].idPage == NIL_GMM_PAGEID*/,
3877	("#%#x: %#x\n", iPage, paPages[iPage].idPage), VERR_INVALID_PARAMETER);
3878
3879	/*
3880	* Take the semaphore and call the worker function.
3881	*/
3882	gmmR0MutexAcquire(pGMM);
3883	if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
3884	{
3885	rc = gmmR0FreePages(pGMM, pGVM, cPages, paPages, enmAccount);
3886	GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
3887	}
3888	else
3889	rc = VERR_GMM_IS_NOT_SANE;
3890	gmmR0MutexRelease(pGMM);
3891	LogFlow(("GMMR0FreePages: returns %Rrc\n", rc));
3892	return rc;
3893	}
3894
3895
3896	/**
3897	* VMMR0 request wrapper for GMMR0FreePages.
3898	*
3899	* @returns see GMMR0FreePages.
3900	* @param pGVM The global (ring-0) VM structure.
3901	* @param idCpu The VCPU id.
3902	* @param pReq Pointer to the request packet.
3903	*/
3904	GMMR0DECL(int) GMMR0FreePagesReq(PGVM pGVM, VMCPUID idCpu, PGMMFREEPAGESREQ pReq)
3905	{
3906	/*
3907	* Validate input and pass it on.
3908	*/
3909	AssertPtrReturn(pReq, VERR_INVALID_POINTER);
3910	AssertMsgReturn(pReq->Hdr.cbReq >= RT_UOFFSETOF(GMMFREEPAGESREQ, aPages[0]),
3911	("%#x < %#x\n", pReq->Hdr.cbReq, RT_UOFFSETOF(GMMFREEPAGESREQ, aPages[0])),
3912	VERR_INVALID_PARAMETER);
3913	AssertMsgReturn(pReq->Hdr.cbReq == RT_UOFFSETOF_DYN(GMMFREEPAGESREQ, aPages[pReq->cPages]),
3914	("%#x != %#x\n", pReq->Hdr.cbReq, RT_UOFFSETOF_DYN(GMMFREEPAGESREQ, aPages[pReq->cPages])),
3915	VERR_INVALID_PARAMETER);
3916
3917	return GMMR0FreePages(pGVM, idCpu, pReq->cPages, &pReq->aPages[0], pReq->enmAccount);
3918	}
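
/*
 * Illustrative sketch only, not part of GMM: the wrapper above checks that
 * the request is at least as large as its fixed header and that its size
 * matches the header plus exactly cPages descriptors. The example below
 * shows the same two checks on a hypothetical request layout (SketchReq and
 * SketchDesc are made up for this sketch). Unlike the real call chain, it
 * does not reject cPages == 0; that is done later by GMMR0FreePages.
 */
```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct SketchDesc { uint32_t idPage; };

struct SketchReq
{
    uint32_t   cbReq;      // total size of the request in bytes
    uint32_t   cPages;     // number of entries in aPages
    SketchDesc aPages[1];  // variable-length tail
};

// Returns true if cbReq is exactly "header + cPages descriptors".
inline bool sketchIsReqSizeValid(uint32_t cbReq, uint32_t cPages)
{
    uint64_t const cbHeader = offsetof(SketchReq, aPages);
    if (cbReq < cbHeader)
        return false;                                   // too small to hold the header
    return cbReq == cbHeader + (uint64_t)cPages * sizeof(SketchDesc);
}
```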
3919
3920
3921	/**
3922	* Report back on a memory ballooning request.
3923	*
3924	* The request may or may not have been initiated by the GMM. If it was initiated
3925	* by the GMM it is important that this function is called even if no pages were
3926	* ballooned.
3927	*
3928	* @returns VBox status code:
3929	* @retval VERR_GMM_ATTEMPT_TO_FREE_TOO_MUCH
3930	* @retval VERR_GMM_ATTEMPT_TO_DEFLATE_TOO_MUCH
3931	* @retval VERR_GMM_OVERCOMMITTED_TRY_AGAIN_IN_A_BIT - reset condition
3932	* indicating that we won't necessarily have sufficient RAM to boot
3933	* the VM again and that it should pause until this changes (we'll try
3934	* balloon some other VM). (For standard deflate we have little choice
3935	* but to hope the VM won't use the memory that was returned to it.)
3936	*
3937	* @param pGVM The global (ring-0) VM structure.
3938	* @param idCpu The VCPU id.
3939	* @param enmAction Inflate/deflate/reset.
3940	* @param cBalloonedPages The number of pages that were ballooned.
3941	*
3942	* @thread EMT(idCpu)
3943	*/
3944	GMMR0DECL(int) GMMR0BalloonedPages(PGVM pGVM, VMCPUID idCpu, GMMBALLOONACTION enmAction, uint32_t cBalloonedPages)
3945	{
3946	LogFlow(("GMMR0BalloonedPages: pGVM=%p enmAction=%d cBalloonedPages=%#x\n",
3947	pGVM, enmAction, cBalloonedPages));
3948
3949	AssertMsgReturn(cBalloonedPages < RT_BIT(32 - PAGE_SHIFT), ("%#x\n", cBalloonedPages), VERR_INVALID_PARAMETER);
3950
3951	/*
3952	* Validate input and get the basics.
3953	*/
3954	PGMM pGMM;
3955	GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
3956	int rc = GVMMR0ValidateGVMandEMT(pGVM, idCpu);
3957	if (RT_FAILURE(rc))
3958	return rc;
3959
3960	/*
3961	* Take the semaphore and do some more validations.
3962	*/
3963	gmmR0MutexAcquire(pGMM);
3964	if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
3965	{
3966	switch (enmAction)
3967	{
3968	case GMMBALLOONACTION_INFLATE:
3969	{
3970	if (RT_LIKELY(pGVM->gmm.s.Stats.Allocated.cBasePages + pGVM->gmm.s.Stats.cBalloonedPages + cBalloonedPages
3971	<= pGVM->gmm.s.Stats.Reserved.cBasePages))
3972	{
3973	/*
3974	* Record the ballooned memory.
3975	*/
3976	pGMM->cBalloonedPages += cBalloonedPages;
3977	if (pGVM->gmm.s.Stats.cReqBalloonedPages)
3978	{
3979	/* Codepath never taken. Might be interesting in the future to request ballooned memory from guests in low memory conditions.. */
3980	AssertFailed();
3981
3982	pGVM->gmm.s.Stats.cBalloonedPages += cBalloonedPages;
3983	pGVM->gmm.s.Stats.cReqActuallyBalloonedPages += cBalloonedPages;
3984	Log(("GMMR0BalloonedPages: +%#x - Global=%#llx / VM: Total=%#llx Req=%#llx Actual=%#llx (pending)\n",
3985	cBalloonedPages, pGMM->cBalloonedPages, pGVM->gmm.s.Stats.cBalloonedPages,
3986	pGVM->gmm.s.Stats.cReqBalloonedPages, pGVM->gmm.s.Stats.cReqActuallyBalloonedPages));
3987	}
3988	else
3989	{
3990	pGVM->gmm.s.Stats.cBalloonedPages += cBalloonedPages;
3991	Log(("GMMR0BalloonedPages: +%#x - Global=%#llx / VM: Total=%#llx (user)\n",
3992	cBalloonedPages, pGMM->cBalloonedPages, pGVM->gmm.s.Stats.cBalloonedPages));
3993	}
3994	}
3995	else
3996	{
3997	Log(("GMMR0BalloonedPages: cBasePages=%#llx Total=%#llx cBalloonedPages=%#llx Reserved=%#llx\n",
3998	pGVM->gmm.s.Stats.Allocated.cBasePages, pGVM->gmm.s.Stats.cBalloonedPages, cBalloonedPages,
3999	pGVM->gmm.s.Stats.Reserved.cBasePages));
4000	rc = VERR_GMM_ATTEMPT_TO_FREE_TOO_MUCH;
4001	}
4002	break;
4003	}
4004
4005	case GMMBALLOONACTION_DEFLATE:
4006	{
4007	/* Deflate. */
4008	if (pGVM->gmm.s.Stats.cBalloonedPages >= cBalloonedPages)
4009	{
4010	/*
4011	* Record the ballooned memory.
4012	*/
4013	Assert(pGMM->cBalloonedPages >= cBalloonedPages);
4014	pGMM->cBalloonedPages -= cBalloonedPages;
4015	pGVM->gmm.s.Stats.cBalloonedPages -= cBalloonedPages;
4016	if (pGVM->gmm.s.Stats.cReqDeflatePages)
4017	{
4018	AssertFailed(); /* This path is for later. */
4019	Log(("GMMR0BalloonedPages: -%#x - Global=%#llx / VM: Total=%#llx Req=%#llx\n",
4020	cBalloonedPages, pGMM->cBalloonedPages, pGVM->gmm.s.Stats.cBalloonedPages, pGVM->gmm.s.Stats.cReqDeflatePages));
4021
4022	/*
4023	* Anything we need to do here now that the request has been completed?
4024	*/
4025	pGVM->gmm.s.Stats.cReqDeflatePages = 0;
4026	}
4027	else
4028	Log(("GMMR0BalloonedPages: -%#x - Global=%#llx / VM: Total=%#llx (user)\n",
4029	cBalloonedPages, pGMM->cBalloonedPages, pGVM->gmm.s.Stats.cBalloonedPages));
4030	}
4031	else
4032	{
4033	Log(("GMMR0BalloonedPages: Total=%#llx cBalloonedPages=%#llx\n", pGVM->gmm.s.Stats.cBalloonedPages, cBalloonedPages));
4034	rc = VERR_GMM_ATTEMPT_TO_DEFLATE_TOO_MUCH;
4035	}
4036	break;
4037	}
4038
4039	case GMMBALLOONACTION_RESET:
4040	{
4041	/* Reset to an empty balloon. */
4042	Assert(pGMM->cBalloonedPages >= pGVM->gmm.s.Stats.cBalloonedPages);
4043
4044	pGMM->cBalloonedPages -= pGVM->gmm.s.Stats.cBalloonedPages;
4045	pGVM->gmm.s.Stats.cBalloonedPages = 0;
4046	break;
4047	}
4048
4049	default:
4050	rc = VERR_INVALID_PARAMETER;
4051	break;
4052	}
4053	GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
4054	}
4055	else
4056	rc = VERR_GMM_IS_NOT_SANE;
4057
4058	gmmR0MutexRelease(pGMM);
4059	LogFlow(("GMMR0BalloonedPages: returns %Rrc\n", rc));
4060	return rc;
4061	}
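
/*
 * Illustrative sketch only, not part of GMM: the inflate, deflate and reset
 * cases above keep a global and a per-VM balloon counter in step. The
 * example below reduces that bookkeeping to plain counters. The type and
 * function names and the status values are hypothetical; locking, logging
 * and the "requested balloon" paths are left out.
 */
```cpp
#include <cassert>
#include <cstdint>

enum SketchBalloonAction { kInflate, kDeflate, kReset };
enum SketchBalloonStatus { kBalloonOk = 0, kFreeTooMuch = -1, kDeflateTooMuch = -2, kBalloonInvalid = -3 };

struct SketchVmStats
{
    uint64_t cAllocated;   // pages currently allocated to the VM
    uint64_t cReserved;    // pages reserved for the VM
    uint64_t cBallooned;   // pages currently in this VM's balloon
};

inline SketchBalloonStatus sketchBalloon(uint64_t &cGlobalBallooned, SketchVmStats &vm,
                                         SketchBalloonAction action, uint32_t cPages)
{
    switch (action)
    {
        case kInflate:
            // Allocated plus ballooned pages may never exceed the reservation.
            if (vm.cAllocated + vm.cBallooned + cPages > vm.cReserved)
                return kFreeTooMuch;
            cGlobalBallooned += cPages;
            vm.cBallooned    += cPages;
            return kBalloonOk;

        case kDeflate:
            // A VM cannot take back more than it put into the balloon.
            if (vm.cBallooned < cPages)
                return kDeflateTooMuch;
            assert(cGlobalBallooned >= cPages);
            cGlobalBallooned -= cPages;
            vm.cBallooned    -= cPages;
            return kBalloonOk;

        case kReset:
            // Empty this VM's balloon and remove its share from the global count.
            assert(cGlobalBallooned >= vm.cBallooned);
            cGlobalBallooned -= vm.cBallooned;
            vm.cBallooned     = 0;
            return kBalloonOk;
    }
    return kBalloonInvalid;   // unknown action
}
```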
4062
4063
4064	/**
4065	* VMMR0 request wrapper for GMMR0BalloonedPages.
4066	*
4067	* @returns see GMMR0BalloonedPages.
4068	* @param pGVM The global (ring-0) VM structure.
4069	* @param idCpu The VCPU id.
4070	* @param pReq Pointer to the request packet.
4071	*/
4072	GMMR0DECL(int) GMMR0BalloonedPagesReq(PGVM pGVM, VMCPUID idCpu, PGMMBALLOONEDPAGESREQ pReq)
4073	{
4074	/*
4075	* Validate input and pass it on.
4076	*/
4077	AssertPtrReturn(pReq, VERR_INVALID_POINTER);
4078	AssertMsgReturn(pReq->Hdr.cbReq == sizeof(GMMBALLOONEDPAGESREQ),
4079	("%#x != %#x\n", pReq->Hdr.cbReq, sizeof(GMMBALLOONEDPAGESREQ)),
4080	VERR_INVALID_PARAMETER);
4081
4082	return GMMR0BalloonedPages(pGVM, idCpu, pReq->enmAction, pReq->cBalloonedPages);
4083	}
4084
4085
4086	/**
4087	* Return memory statistics for the hypervisor
4088	*
4089	* @returns VBox status code.
4090	* @param pReq Pointer to the request packet.
4091	*/
4092	GMMR0DECL(int) GMMR0QueryHypervisorMemoryStatsReq(PGMMMEMSTATSREQ pReq)
4093	{
4094	/*
4095	* Validate input and pass it on.
4096	*/
4097	AssertPtrReturn(pReq, VERR_INVALID_POINTER);
4098	AssertMsgReturn(pReq->Hdr.cbReq == sizeof(GMMMEMSTATSREQ),
4099	("%#x != %#x\n", pReq->Hdr.cbReq, sizeof(GMMMEMSTATSREQ)),
4100	VERR_INVALID_PARAMETER);
4101
4102	/*
4103	* Validate input and get the basics.
4104	*/
4105	PGMM pGMM;
4106	GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
4107	pReq->cAllocPages = pGMM->cAllocatedPages;
4108	pReq->cFreePages = (pGMM->cChunks << (GMM_CHUNK_SHIFT - PAGE_SHIFT)) - pGMM->cAllocatedPages;
4109	pReq->cBalloonedPages = pGMM->cBalloonedPages;
4110	pReq->cMaxPages = pGMM->cMaxPages;
4111	pReq->cSharedPages = pGMM->cDuplicatePages;
4112	GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
4113
4114	return VINF_SUCCESS;
4115	}
4116
4117
4118	/**
4119	* Return memory statistics for the VM
4120	*
4121	* @returns VBox status code.
4122	* @param pGVM The global (ring-0) VM structure.
4123	* @param idCpu Cpu id.
4124	* @param pReq Pointer to the request packet.
4125	*
4126	* @thread EMT(idCpu)
4127	*/
4128	GMMR0DECL(int) GMMR0QueryMemoryStatsReq(PGVM pGVM, VMCPUID idCpu, PGMMMEMSTATSREQ pReq)
4129	{
4130	/*
4131	* Validate input and pass it on.
4132	*/
4133	AssertPtrReturn(pReq, VERR_INVALID_POINTER);
4134	AssertMsgReturn(pReq->Hdr.cbReq == sizeof(GMMMEMSTATSREQ),
4135	("%#x != %#x\n", pReq->Hdr.cbReq, sizeof(GMMMEMSTATSREQ)),
4136	VERR_INVALID_PARAMETER);
4137
4138	/*
4139	* Validate input and get the basics.
4140	*/
4141	PGMM pGMM;
4142	GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
4143	int rc = GVMMR0ValidateGVMandEMT(pGVM, idCpu);
4144	if (RT_FAILURE(rc))
4145	return rc;
4146
4147	/*
4148	* Take the semaphore and do some more validations.
4149	*/
4150	gmmR0MutexAcquire(pGMM);
4151	if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
4152	{
4153	pReq->cAllocPages = pGVM->gmm.s.Stats.Allocated.cBasePages;
4154	pReq->cBalloonedPages = pGVM->gmm.s.Stats.cBalloonedPages;
4155	pReq->cMaxPages = pGVM->gmm.s.Stats.Reserved.cBasePages;
4156	pReq->cFreePages = pReq->cMaxPages - pReq->cAllocPages;
4157	}
4158	else
4159	rc = VERR_GMM_IS_NOT_SANE;
4160
4161	gmmR0MutexRelease(pGMM);
4162	LogFlow(("GMMR0QueryMemoryStatsReq: returns %Rrc\n", rc));
4163	return rc;
4164	}
4165
4166
4167	/**
4168	* Worker for gmmR0UnmapChunk and gmmR0FreeChunk.
4169	*
4170	* Don't call this in legacy allocation mode!
4171	*
4172	* @returns VBox status code.
4173	* @param pGMM Pointer to the GMM instance data.
4174	* @param pGVM Pointer to the Global VM structure.
4175	* @param pChunk Pointer to the chunk to be unmapped.
4176	*/
4177	static int gmmR0UnmapChunkLocked(PGMM pGMM, PGVM pGVM, PGMMCHUNK pChunk)
4178	{
4179	RT_NOREF_PV(pGMM);
4180
4181	/*
4182	* Find the mapping and try unmapping it.
4183	*/
4184	uint32_t cMappings = pChunk->cMappingsX;
4185	for (uint32_t i = 0; i < cMappings; i++)
4186	{
4187	Assert(pChunk->paMappingsX[i].pGVM && pChunk->paMappingsX[i].hMapObj != NIL_RTR0MEMOBJ);
4188	if (pChunk->paMappingsX[i].pGVM == pGVM)
4189	{
4190	/* unmap */
4191	int rc = RTR0MemObjFree(pChunk->paMappingsX[i].hMapObj, false /* fFreeMappings (NA) */);
4192	if (RT_SUCCESS(rc))
4193	{
4194	/* update the record. */
4195	cMappings--;
4196	if (i < cMappings)
4197	pChunk->paMappingsX[i] = pChunk->paMappingsX[cMappings];
4198	pChunk->paMappingsX[cMappings].hMapObj = NIL_RTR0MEMOBJ;
4199	pChunk->paMappingsX[cMappings].pGVM = NULL;
4200	Assert(pChunk->cMappingsX - 1U == cMappings);
4201	pChunk->cMappingsX = cMappings;
4202	}
4203
4204	return rc;
4205	}
4206	}
4207
4208	Log(("gmmR0UnmapChunk: Chunk %#x is not mapped into pGVM=%p/%#x\n", pChunk->Core.Key, pGVM, pGVM->hSelf));
4209	return VERR_GMM_CHUNK_NOT_MAPPED;
4210	}
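
/*
 * Illustrative sketch only, not part of GMM: the worker above removes a
 * mapping record by copying the last array entry over it and shrinking the
 * count, which avoids shifting the tail. The example below shows the same
 * "swap with last" removal on a std::vector. SketchMapping and
 * sketchRemoveMapping are hypothetical names, and the owner/handle fields
 * are plain integers standing in for the VM pointer and memory object.
 */
```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct SketchMapping { int owner; int handle; };

// Removes the entry owned by `owner` by moving the last entry into its slot.
// Order is not preserved, which is acceptable for an unordered set of
// mappings. Returns true if an entry was removed.
inline bool sketchRemoveMapping(std::vector<SketchMapping> &mappings, int owner)
{
    for (size_t i = 0; i < mappings.size(); i++)
        if (mappings[i].owner == owner)
        {
            size_t const iLast = mappings.size() - 1;
            if (i < iLast)
                mappings[i] = mappings[iLast];   // fill the hole with the last entry
            mappings.pop_back();                 // drop the now-duplicate tail
            return true;
        }
    return false;                                // not mapped for this owner
}
```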
4211
4212
4213	/**
4214	* Unmaps a chunk previously mapped into the address space of the current process.
4215	*
4216	* @returns VBox status code.
4217	* @param pGMM Pointer to the GMM instance data.
4218	* @param pGVM Pointer to the Global VM structure.
4219	* @param pChunk Pointer to the chunk to be unmapped.
4220	* @param fRelaxedSem Whether we can release the semaphore while doing the
4221	* unmapping (@c true) or not.
4222	*/
4223	static int gmmR0UnmapChunk(PGMM pGMM, PGVM pGVM, PGMMCHUNK pChunk, bool fRelaxedSem)
4224	{
4225	/*
4226	* Lock the chunk and if possible leave the giant GMM lock.
4227	*/
4228	GMMR0CHUNKMTXSTATE MtxState;
4229	int rc = gmmR0ChunkMutexAcquire(&MtxState, pGMM, pChunk,
4230	fRelaxedSem ? GMMR0CHUNK_MTX_RETAKE_GIANT : GMMR0CHUNK_MTX_KEEP_GIANT);
4231	if (RT_SUCCESS(rc))
4232	{
4233	rc = gmmR0UnmapChunkLocked(pGMM, pGVM, pChunk);
4234	gmmR0ChunkMutexRelease(&MtxState, pChunk);
4235	}
4236	return rc;
4237	}
4238
4239
4240	/**
4241	* Worker for gmmR0MapChunk.
4242	*
4243	* @returns VBox status code.
4244	* @param pGMM Pointer to the GMM instance data.
4245	* @param pGVM Pointer to the Global VM structure.
4246	* @param pChunk Pointer to the chunk to be mapped.
4247	* @param ppvR3 Where to store the ring-3 address of the mapping.
4248	* In the VERR_GMM_CHUNK_ALREADY_MAPPED case, this will
4249	* contain the address of the existing mapping.
4250	*/
4251	static int gmmR0MapChunkLocked(PGMM pGMM, PGVM pGVM, PGMMCHUNK pChunk, PRTR3PTR ppvR3)
4252	{
4253	RT_NOREF(pGMM);
4254
4255	/*
4256	* Check to see if the chunk is already mapped.
4257	*/
4258	for (uint32_t i = 0; i < pChunk->cMappingsX; i++)
4259	{
4260	Assert(pChunk->paMappingsX[i].pGVM && pChunk->paMappingsX[i].hMapObj != NIL_RTR0MEMOBJ);
4261	if (pChunk->paMappingsX[i].pGVM == pGVM)
4262	{
4263	*ppvR3 = RTR0MemObjAddressR3(pChunk->paMappingsX[i].hMapObj);
4264	Log(("gmmR0MapChunk: chunk %#x is already mapped at %p!\n", pChunk->Core.Key, *ppvR3));
4265	#ifdef VBOX_WITH_PAGE_SHARING
4266	/* The ring-3 chunk cache can be out of sync; don't fail. */
4267	return VINF_SUCCESS;
4268	#else
4269	return VERR_GMM_CHUNK_ALREADY_MAPPED;
4270	#endif
4271	}
4272	}
4273
4274	/*
4275	* Do the mapping.
4276	*/
4277	RTR0MEMOBJ hMapObj;
4278	int rc = RTR0MemObjMapUser(&hMapObj, pChunk->hMemObj, (RTR3PTR)-1, 0, RTMEM_PROT_READ | RTMEM_PROT_WRITE, NIL_RTR0PROCESS);
4279	if (RT_SUCCESS(rc))
4280	{
4281	/* reallocate the array? assumes few users per chunk (usually one). */
4282	unsigned iMapping = pChunk->cMappingsX;
4283	if ( iMapping <= 3
4284	|| (iMapping & 3) == 0)
4285	{
4286	unsigned cNewSize = iMapping <= 3
4287	? iMapping + 1
4288	: iMapping + 4;
4289	Assert(cNewSize < 4 || RT_ALIGN_32(cNewSize, 4) == cNewSize);
4290	if (RT_UNLIKELY(cNewSize > UINT16_MAX))
4291	{
4292	rc = RTR0MemObjFree(hMapObj, false /* fFreeMappings (NA) */); AssertRC(rc);
4293	return VERR_GMM_TOO_MANY_CHUNK_MAPPINGS;
4294	}
4295
4296	void *pvMappings = RTMemRealloc(pChunk->paMappingsX, cNewSize * sizeof(pChunk->paMappingsX[0]));
4297	if (RT_UNLIKELY(!pvMappings))
4298	{
4299	rc = RTR0MemObjFree(hMapObj, false /* fFreeMappings (NA) */); AssertRC(rc);
4300	return VERR_NO_MEMORY;
4301	}
4302	pChunk->paMappingsX = (PGMMCHUNKMAP)pvMappings;
4303	}
4304
4305	/* insert new entry */
4306	pChunk->paMappingsX[iMapping].hMapObj = hMapObj;
4307	pChunk->paMappingsX[iMapping].pGVM = pGVM;
4308	Assert(pChunk->cMappingsX == iMapping);
4309	pChunk->cMappingsX = iMapping + 1;
4310
4311	*ppvR3 = RTR0MemObjAddressR3(hMapObj);
4312	}
4313
4314	return rc;
4315	}
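
/*
 * Illustrative sketch only, not part of GMM: the worker above grows the
 * mapping array one entry at a time for the first four entries and in steps
 * of four afterwards, on the assumption that a chunk usually has very few
 * users. The hypothetical helper below isolates that sizing rule: given the
 * number of entries in use, it returns the new capacity to reallocate to
 * before appending, or 0 when the current allocation still has room.
 */
```cpp
#include <cassert>

inline unsigned sketchNewCapacity(unsigned cUsed)
{
    if (cUsed <= 3)
        return cUsed + 1;        // small arrays grow by exactly one entry
    if ((cUsed & 3) == 0)
        return cUsed + 4;        // at a multiple of four, grow by four
    return 0;                    // room left from the previous step of four
}
```
/*
 * For example, appending the fifth entry (cUsed == 4) reallocates to 8
 * entries, and the sixth to eighth entries need no reallocation.
 */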
4316
4317
4318	/**
4319	* Maps a chunk into the user address space of the current process.
4320	*
4321	* @returns VBox status code.
4322	* @param pGMM Pointer to the GMM instance data.
4323	* @param pGVM Pointer to the Global VM structure.
4324	* @param pChunk Pointer to the chunk to be mapped.
4325	* @param fRelaxedSem Whether we can release the semaphore while doing the
4326	* mapping (@c true) or not.
4327	* @param ppvR3 Where to store the ring-3 address of the mapping.
4328	* In the VERR_GMM_CHUNK_ALREADY_MAPPED case, this will
4329	* contain the address of the existing mapping.
4330	*/
4331	static int gmmR0MapChunk(PGMM pGMM, PGVM pGVM, PGMMCHUNK pChunk, bool fRelaxedSem, PRTR3PTR ppvR3)
4332	{
4333	/*
4334	* Take the chunk lock and leave the giant GMM lock when possible, then
4335	* call the worker function.
4336	*/
4337	GMMR0CHUNKMTXSTATE MtxState;
4338	int rc = gmmR0ChunkMutexAcquire(&MtxState, pGMM, pChunk,
4339	fRelaxedSem ? GMMR0CHUNK_MTX_RETAKE_GIANT : GMMR0CHUNK_MTX_KEEP_GIANT);
4340	if (RT_SUCCESS(rc))
4341	{
4342	rc = gmmR0MapChunkLocked(pGMM, pGVM, pChunk, ppvR3);
4343	gmmR0ChunkMutexRelease(&MtxState, pChunk);
4344	}
4345
4346	return rc;
4347	}
4348
4349
4350
4351	#if defined(VBOX_WITH_PAGE_SHARING) || defined(VBOX_STRICT)
4352	/**
4353	* Check if a chunk is mapped into the specified VM
4354	*
4355	* @returns mapped yes/no
4356	* @param pGMM Pointer to the GMM instance.
4357	* @param pGVM Pointer to the Global VM structure.
4358	* @param pChunk Pointer to the chunk to check.
4359	* @param ppvR3 Where to store the ring-3 address of the mapping.
4360	*/
4361	static bool gmmR0IsChunkMapped(PGMM pGMM, PGVM pGVM, PGMMCHUNK pChunk, PRTR3PTR ppvR3)
4362	{
4363	GMMR0CHUNKMTXSTATE MtxState;
4364	gmmR0ChunkMutexAcquire(&MtxState, pGMM, pChunk, GMMR0CHUNK_MTX_KEEP_GIANT);
4365	for (uint32_t i = 0; i < pChunk->cMappingsX; i++)
4366	{
4367	Assert(pChunk->paMappingsX[i].pGVM && pChunk->paMappingsX[i].hMapObj != NIL_RTR0MEMOBJ);
4368	if (pChunk->paMappingsX[i].pGVM == pGVM)
4369	{
4370	*ppvR3 = RTR0MemObjAddressR3(pChunk->paMappingsX[i].hMapObj);
4371	gmmR0ChunkMutexRelease(&MtxState, pChunk);
4372	return true;
4373	}
4374	}
4375	*ppvR3 = NULL;
4376	gmmR0ChunkMutexRelease(&MtxState, pChunk);
4377	return false;
4378	}
4379	#endif /* VBOX_WITH_PAGE_SHARING || VBOX_STRICT */
4380
4381
4382	/**
4383	* Map a chunk and/or unmap another chunk.
4384	*
4385	* The mapping and unmapping applies to the current process.
4386	*
4387	* This API does two things because it saves a kernel call per mapping
4388	* when the ring-3 mapping cache is full.
4389	*
4390	* @returns VBox status code.
4391	* @param pGVM The global (ring-0) VM structure.
4392	* @param idChunkMap The chunk to map. NIL_GMM_CHUNKID if nothing to map.
4393	* @param idChunkUnmap The chunk to unmap. NIL_GMM_CHUNKID if nothing to unmap.
4394	* @param ppvR3 Where to store the address of the mapped chunk. NULL is ok if nothing to map.
4395	* @thread EMT ???
4396	*/
4397	GMMR0DECL(int) GMMR0MapUnmapChunk(PGVM pGVM, uint32_t idChunkMap, uint32_t idChunkUnmap, PRTR3PTR ppvR3)
4398	{
4399	LogFlow(("GMMR0MapUnmapChunk: pGVM=%p idChunkMap=%#x idChunkUnmap=%#x ppvR3=%p\n",
4400	pGVM, idChunkMap, idChunkUnmap, ppvR3));
4401
4402	/*
4403	* Validate input and get the basics.
4404	*/
4405	PGMM pGMM;
4406	GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
4407	int rc = GVMMR0ValidateGVM(pGVM);
4408	if (RT_FAILURE(rc))
4409	return rc;
4410
4411	AssertCompile(NIL_GMM_CHUNKID == 0);
4412	AssertMsgReturn(idChunkMap <= GMM_CHUNKID_LAST, ("%#x\n", idChunkMap), VERR_INVALID_PARAMETER);
4413	AssertMsgReturn(idChunkUnmap <= GMM_CHUNKID_LAST, ("%#x\n", idChunkUnmap), VERR_INVALID_PARAMETER);
4414
4415	if ( idChunkMap == NIL_GMM_CHUNKID
4416	&& idChunkUnmap == NIL_GMM_CHUNKID)
4417	return VERR_INVALID_PARAMETER;
4418
4419	if (idChunkMap != NIL_GMM_CHUNKID)
4420	{
4421	AssertPtrReturn(ppvR3, VERR_INVALID_POINTER);
4422	*ppvR3 = NIL_RTR3PTR;
4423	}
4424
4425	/*
4426	* Take the semaphore and do the work.
4427	*
4428	* The unmapping is done last since it's easier to undo a mapping than
4429	* to undo an unmapping. The ring-3 mapping cache cannot be so big
4430	* that it pushes the user virtual address space to within a chunk of
4431	* its limits, so, no problem here.
4432	*/
4433	gmmR0MutexAcquire(pGMM);
4434	if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
4435	{
4436	PGMMCHUNK pMap = NULL;
4437	if (idChunkMap != NIL_GMM_CHUNKID)
4438	{
4439	pMap = gmmR0GetChunk(pGMM, idChunkMap);
4440	if (RT_LIKELY(pMap))
4441	rc = gmmR0MapChunk(pGMM, pGVM, pMap, true /*fRelaxedSem*/, ppvR3);
4442	else
4443	{
4444	Log(("GMMR0MapUnmapChunk: idChunkMap=%#x\n", idChunkMap));
4445	rc = VERR_GMM_CHUNK_NOT_FOUND;
4446	}
4447	}
4448	/** @todo split this operation, the bail out might (theoretically) not be
4449	* entirely safe. */
4450
4451	if ( idChunkUnmap != NIL_GMM_CHUNKID
4452	&& RT_SUCCESS(rc))
4453	{
4454	PGMMCHUNK pUnmap = gmmR0GetChunk(pGMM, idChunkUnmap);
4455	if (RT_LIKELY(pUnmap))
4456	rc = gmmR0UnmapChunk(pGMM, pGVM, pUnmap, true /*fRelaxedSem*/);
4457	else
4458	{
4459	Log(("GMMR0MapUnmapChunk: idChunkUnmap=%#x\n", idChunkUnmap));
4460	rc = VERR_GMM_CHUNK_NOT_FOUND;
4461	}
4462
4463	if (RT_FAILURE(rc) && pMap)
4464	gmmR0UnmapChunk(pGMM, pGVM, pMap, false /*fRelaxedSem*/);
4465	}
4466
4467	GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
4468	}
4469	else
4470	rc = VERR_GMM_IS_NOT_SANE;
4471	gmmR0MutexRelease(pGMM);
4472
4473	LogFlow(("GMMR0MapUnmapChunk: returns %Rrc\n", rc));
4474	return rc;
4475	}
4476
4477
4478	/**
4479	* VMMR0 request wrapper for GMMR0MapUnmapChunk.
4480	*
4481	* @returns see GMMR0MapUnmapChunk.
4482	* @param pGVM The global (ring-0) VM structure.
4483	* @param pReq Pointer to the request packet.
4484	*/
4485	GMMR0DECL(int) GMMR0MapUnmapChunkReq(PGVM pGVM, PGMMMAPUNMAPCHUNKREQ pReq)
4486	{
4487	/*
4488	* Validate input and pass it on.
4489	*/
4490	AssertPtrReturn(pReq, VERR_INVALID_POINTER);
4491	AssertMsgReturn(pReq->Hdr.cbReq == sizeof(*pReq), ("%#x != %#x\n", pReq->Hdr.cbReq, sizeof(*pReq)), VERR_INVALID_PARAMETER);
4492
4493	return GMMR0MapUnmapChunk(pGVM, pReq->idChunkMap, pReq->idChunkUnmap, &pReq->pvR3);
4494	}
4495
4496
4497	#ifndef VBOX_WITH_LINEAR_HOST_PHYS_MEM
4498	/**
4499	* Gets the ring-0 virtual address for the given page.
4500	*
4501	* This is used by PGM when IEM and such want to access guest RAM from ring-0.
4502	* One of the ASSUMPTIONS here is that the @a idPage is used by the VM and the
4503	* corresponding chunk will remain valid beyond the call (at least till the EMT
4504	* returns to ring-3).
4505	*
4506	* @returns VBox status code.
4507	* @param pGVM Pointer to the kernel-only VM instance data.
4508	* @param idPage The page ID.
4509	* @param ppv Where to store the address.
4510	* @thread EMT
4511	*/
4512	GMMR0DECL(int) GMMR0PageIdToVirt(PGVM pGVM, uint32_t idPage, void **ppv)
4513	{
4514	*ppv = NULL;
4515	PGMM pGMM;
4516	GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
4517
4518	uint32_t const idChunk = idPage >> GMM_CHUNKID_SHIFT;
4519
4520	/*
4521	* Start with the per-VM TLB.
4522	*/
4523	RTSpinlockAcquire(pGVM->gmm.s.hChunkTlbSpinLock);
4524
4525	PGMMPERVMCHUNKTLBE pTlbe = &pGVM->gmm.s.aChunkTlbEntries[GMMPERVM_CHUNKTLB_IDX(idChunk)];
4526	PGMMCHUNK pChunk = pTlbe->pChunk;
4527	if ( pChunk != NULL
4528	&& pTlbe->idGeneration == ASMAtomicUoReadU64(&pGMM->idFreeGeneration)
4529	&& pChunk->Core.Key == idChunk)
4530	pGVM->R0Stats.gmm.cChunkTlbHits++; /* hopefully this is a likely outcome */
4531	else
4532	{
4533	pGVM->R0Stats.gmm.cChunkTlbMisses++;
4534
4535	/*
4536	* Look it up in the chunk tree.
4537	*/
4538	RTSpinlockAcquire(pGMM->hSpinLockTree);
4539	pChunk = gmmR0GetChunkLocked(pGMM, idChunk);
4540	if (RT_LIKELY(pChunk))
4541	{
4542	pTlbe->idGeneration = pGMM->idFreeGeneration;
4543	RTSpinlockRelease(pGMM->hSpinLockTree);
4544	pTlbe->pChunk = pChunk;
4545	}
4546	else
4547	{
4548	RTSpinlockRelease(pGMM->hSpinLockTree);
4549	RTSpinlockRelease(pGVM->gmm.s.hChunkTlbSpinLock);
4550	AssertMsgFailed(("idPage=%#x\n", idPage));
4551	return VERR_GMM_PAGE_NOT_FOUND;
4552	}
4553	}
4554
4555	RTSpinlockRelease(pGVM->gmm.s.hChunkTlbSpinLock);
4556
4557	/*
4558	* Got a chunk, now validate the page ownership and calculate its address.
4559	*/
4560	const GMMPAGE * const pPage = &pChunk->aPages[idPage & GMM_PAGEID_IDX_MASK];
4561	if (RT_LIKELY( ( GMM_PAGE_IS_PRIVATE(pPage)
4562	&& pPage->Private.hGVM == pGVM->hSelf)
4563	|| GMM_PAGE_IS_SHARED(pPage)))
4564	{
4565	AssertPtr(pChunk->pbMapping);
4566	*ppv = &pChunk->pbMapping[(idPage & GMM_PAGEID_IDX_MASK) << PAGE_SHIFT];
4567	return VINF_SUCCESS;
4568	}
4569	AssertMsgFailed(("idPage=%#x is-private=%RTbool Private.hGVM=%u pGVM->hSelf=%u\n",
4570	idPage, GMM_PAGE_IS_PRIVATE(pPage), pPage->Private.hGVM, pGVM->hSelf));
4571	return VERR_GMM_NOT_PAGE_OWNER;
4572	}
4573	#endif /* !VBOX_WITH_LINEAR_HOST_PHYS_MEM */
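
/*
 * Illustrative sketch only, not part of GMM: GMMR0PageIdToVirt first tries
 * a small per-VM cache indexed by chunk ID and falls back to the chunk tree
 * on a miss. An entry only counts as a hit if it was filled under the
 * current "free generation", so bumping that counter when a chunk is freed
 * invalidates all stale entries at once. The example below shows the idea
 * with hypothetical types (SketchTlb, SketchChunk), a std::map in place of
 * the chunk tree, an assumed cache size of 8 entries, and no locking.
 */
```cpp
#include <cassert>
#include <cstdint>
#include <map>

struct SketchChunk { uint32_t id; };

struct SketchTlbEntry
{
    uint64_t           generation = 0;
    const SketchChunk *chunk      = nullptr;
};

struct SketchTlb
{
    static constexpr uint32_t kEntries = 8;   // assumption: power of two
    SketchTlbEntry entries[kEntries];
    uint32_t       hits   = 0;
    uint32_t       misses = 0;
};

// Returns the chunk for idChunk, or nullptr if the tree does not know it.
inline const SketchChunk *sketchLookup(SketchTlb &tlb, const std::map<uint32_t, SketchChunk> &tree,
                                       uint64_t freeGeneration, uint32_t idChunk)
{
    SketchTlbEntry &e = tlb.entries[idChunk & (SketchTlb::kEntries - 1)];

    // The generation is checked before the cached pointer is dereferenced,
    // so a stale pointer from an older generation is never touched.
    if (e.chunk && e.generation == freeGeneration && e.chunk->id == idChunk)
    {
        tlb.hits++;
        return e.chunk;
    }

    tlb.misses++;
    auto it = tree.find(idChunk);
    if (it == tree.end())
        return nullptr;

    e.generation = freeGeneration;   // refill the entry under the current generation
    e.chunk      = &it->second;
    return e.chunk;
}
```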
4574
4575	#ifdef VBOX_WITH_PAGE_SHARING
4576
4577	# ifdef VBOX_STRICT
4578	/**
4579	* For checksumming shared pages in strict builds.
4580	*
4581	* The purpose is making sure that a page doesn't change.
4582	*
4583	* @returns Checksum, 0 on failure.
4584	* @param pGMM The GMM instance data.
4585	* @param pGVM Pointer to the kernel-only VM instance data.
4586	* @param idPage The page ID.
4587	*/
4588	static uint32_t gmmR0StrictPageChecksum(PGMM pGMM, PGVM pGVM, uint32_t idPage)
4589	{
4590	PGMMCHUNK pChunk = gmmR0GetChunk(pGMM, idPage >> GMM_CHUNKID_SHIFT);
4591	AssertMsgReturn(pChunk, ("idPage=%#x\n", idPage), 0);
4592
4593	uint8_t *pbChunk;
4594	if (!gmmR0IsChunkMapped(pGMM, pGVM, pChunk, (PRTR3PTR)&pbChunk))
4595	return 0;
4596	uint8_t const *pbPage = pbChunk + ((idPage & GMM_PAGEID_IDX_MASK) << PAGE_SHIFT);
4597
4598	return RTCrc32(pbPage, PAGE_SIZE);
4599	}
4600	# endif /* VBOX_STRICT */
4601
4602
4603	/**
4604	* Calculates the module hash value.
4605	*
4606	* @returns Hash value.
4607	* @param pszModuleName The module name.
4608	* @param pszVersion The module version string.
4609	*/
4610	static uint32_t gmmR0ShModCalcHash(const char *pszModuleName, const char *pszVersion)
4611	{
4612	return RTStrHash1ExN(3, pszModuleName, RTSTR_MAX, "::", (size_t)2, pszVersion, RTSTR_MAX);
4613	}
4614
4615
4616	/**
4617	* Finds a global module.
4618	*
4619	* @returns Pointer to the global module on success, NULL if not found.
4620	* @param pGMM The GMM instance data.
4621	* @param uHash The hash as calculated by gmmR0ShModCalcHash.
4622	* @param cbModule The module size.
4623	* @param enmGuestOS The guest OS type.
4624	* @param cRegions The number of regions.
4625	* @param pszModuleName The module name.
4626	* @param pszVersion The module version.
4627	* @param paRegions The region descriptions.
4628	*/
4629	static PGMMSHAREDMODULE gmmR0ShModFindGlobal(PGMM pGMM, uint32_t uHash, uint32_t cbModule, VBOXOSFAMILY enmGuestOS,
4630	uint32_t cRegions, const char *pszModuleName, const char *pszVersion,
4631	struct VMMDEVSHAREDREGIONDESC const *paRegions)
4632	{
4633	for (PGMMSHAREDMODULE pGblMod = (PGMMSHAREDMODULE)RTAvllU32Get(&pGMM->pGlobalSharedModuleTree, uHash);
4634	pGblMod;
4635	pGblMod = (PGMMSHAREDMODULE)pGblMod->Core.pList)
4636	{
4637	if (pGblMod->cbModule != cbModule)
4638	continue;
4639	if (pGblMod->enmGuestOS != enmGuestOS)
4640	continue;
4641	if (pGblMod->cRegions != cRegions)
4642	continue;
4643	if (strcmp(pGblMod->szName, pszModuleName))
4644	continue;
4645	if (strcmp(pGblMod->szVersion, pszVersion))
4646	continue;
4647
4648	uint32_t i;
4649	for (i = 0; i < cRegions; i++)
4650	{
4651	uint32_t off = paRegions[i].GCRegionAddr & PAGE_OFFSET_MASK;
4652	if (pGblMod->aRegions[i].off != off)
4653	break;
4654
4655	uint32_t cb = RT_ALIGN_32(paRegions[i].cbRegion + off, PAGE_SIZE);
4656	if (pGblMod->aRegions[i].cb != cb)
4657	break;
4658	}
4659
4660	if (i == cRegions)
4661	return pGblMod;
4662	}
4663
4664	return NULL;
4665	}
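
/*
 * Illustrative sketch only, not part of GMM: when comparing regions, the
 * lookup above reduces each guest region to an offset within its first page
 * and a size rounded up to whole pages, so that two registrations of the
 * same module compare equal regardless of where the guest loaded it. The
 * example below shows that normalization. SketchRegion and
 * sketchNormalizeRegion are hypothetical names, and a 4 KiB page size is
 * assumed.
 */
```cpp
#include <cassert>
#include <cstdint>

static const uint32_t kSketchPageSize = 4096;   // assumption: 4 KiB pages

struct SketchRegion { uint32_t off; uint32_t cb; };

inline SketchRegion sketchNormalizeRegion(uint64_t addr, uint32_t cbRegion)
{
    SketchRegion r;
    // Offset of the region start within its page.
    r.off = (uint32_t)(addr & (kSketchPageSize - 1));
    // Size from the start of that page, rounded up to a page multiple.
    r.cb  = (cbRegion + r.off + kSketchPageSize - 1) & ~(kSketchPageSize - 1);
    return r;
}
```
/*
 * A 0x1000 byte region at address 0x401234 thus becomes off=0x234 and
 * cb=0x2000, since it straddles two pages.
 */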
4666
4667
4668	/**
4669	* Creates a new global module.
4670	*
4671	* @returns VBox status code.
4672	* @param pGMM The GMM instance data.
4673	* @param uHash The hash as calculated by gmmR0ShModCalcHash.
4674	* @param cbModule The module size.
4675	* @param enmGuestOS The guest OS type.
4676	* @param cRegions The number of regions.
4677	* @param pszModuleName The module name.
4678	* @param pszVersion The module version.
4679	* @param paRegions The region descriptions.
4680	* @param ppGblMod Where to return the new module on success.
4681	*/
4682	static int gmmR0ShModNewGlobal(PGMM pGMM, uint32_t uHash, uint32_t cbModule, VBOXOSFAMILY enmGuestOS,
4683	uint32_t cRegions, const char *pszModuleName, const char *pszVersion,
4684	struct VMMDEVSHAREDREGIONDESC const *paRegions, PGMMSHAREDMODULE *ppGblMod)
4685	{
4686	Log(("gmmR0ShModNewGlobal: %s %s size %#x os %u rgn %u\n", pszModuleName, pszVersion, cbModule, enmGuestOS, cRegions));
4687	if (pGMM->cShareableModules >= GMM_MAX_SHARED_GLOBAL_MODULES)
4688	{
4689	Log(("gmmR0ShModNewGlobal: Too many modules\n"));
4690	return VERR_GMM_TOO_MANY_GLOBAL_MODULES;
4691	}
4692
4693	PGMMSHAREDMODULE pGblMod = (PGMMSHAREDMODULE)RTMemAllocZ(RT_UOFFSETOF_DYN(GMMSHAREDMODULE, aRegions[cRegions]));
4694	if (!pGblMod)
4695	{
4696	Log(("gmmR0ShModNewGlobal: No memory\n"));
4697	return VERR_NO_MEMORY;
4698	}
4699
4700	pGblMod->Core.Key = uHash;
4701	pGblMod->cbModule = cbModule;
4702	pGblMod->cRegions = cRegions;
4703	pGblMod->cUsers = 1;
4704	pGblMod->enmGuestOS = enmGuestOS;
4705	strcpy(pGblMod->szName, pszModuleName);
4706	strcpy(pGblMod->szVersion, pszVersion);
4707
4708	for (uint32_t i = 0; i < cRegions; i++)
4709	{
4710	Log(("gmmR0ShModNewGlobal: rgn[%u]=%RGvLB%#x\n", i, paRegions[i].GCRegionAddr, paRegions[i].cbRegion));
4711	pGblMod->aRegions[i].off = paRegions[i].GCRegionAddr & PAGE_OFFSET_MASK;
4712	pGblMod->aRegions[i].cb = paRegions[i].cbRegion + pGblMod->aRegions[i].off;
4713	pGblMod->aRegions[i].cb = RT_ALIGN_32(pGblMod->aRegions[i].cb, PAGE_SIZE);
4714	pGblMod->aRegions[i].paidPages = NULL; /* allocated when needed. */
4715	}
4716
4717	bool fInsert = RTAvllU32Insert(&pGMM->pGlobalSharedModuleTree, &pGblMod->Core);
4718	Assert(fInsert); NOREF(fInsert);
4719	pGMM->cShareableModules++;
4720
4721	*ppGblMod = pGblMod;
4722	return VINF_SUCCESS;
4723	}
4724
4725
4726	/**
4727	* Deletes a global module which is no longer referenced by anyone.
4728	*
4729	* @param pGMM The GMM instance data.
4730	* @param pGblMod The module to delete.
4731	*/
4732	static void gmmR0ShModDeleteGlobal(PGMM pGMM, PGMMSHAREDMODULE pGblMod)
4733	{
4734	Assert(pGblMod->cUsers == 0);
4735	Assert(pGMM->cShareableModules > 0 && pGMM->cShareableModules <= GMM_MAX_SHARED_GLOBAL_MODULES);
4736
4737	void *pvTest = RTAvllU32RemoveNode(&pGMM->pGlobalSharedModuleTree, &pGblMod->Core);
4738	Assert(pvTest == pGblMod); NOREF(pvTest);
4739	pGMM->cShareableModules--;
4740
4741	uint32_t i = pGblMod->cRegions;
4742	while (i-- > 0)
4743	{
4744	if (pGblMod->aRegions[i].paidPages)
4745	{
4746	/* We don't do anything to the pages as they are handled by the
4747	copy-on-write mechanism in PGM. */
4748	RTMemFree(pGblMod->aRegions[i].paidPages);
4749	pGblMod->aRegions[i].paidPages = NULL;
4750	}
4751	}
4752	RTMemFree(pGblMod);
4753	}
4754
4755
4756	static int gmmR0ShModNewPerVM(PGVM pGVM, RTGCPTR GCBaseAddr, uint32_t cRegions, const VMMDEVSHAREDREGIONDESC *paRegions,
4757	PGMMSHAREDMODULEPERVM *ppRecVM)
4758	{
4759	if (pGVM->gmm.s.Stats.cShareableModules >= GMM_MAX_SHARED_PER_VM_MODULES)
4760	return VERR_GMM_TOO_MANY_PER_VM_MODULES;
4761
4762	PGMMSHAREDMODULEPERVM pRecVM;
4763	pRecVM = (PGMMSHAREDMODULEPERVM)RTMemAllocZ(RT_UOFFSETOF_DYN(GMMSHAREDMODULEPERVM, aRegionsGCPtrs[cRegions]));
4764	if (!pRecVM)
4765	return VERR_NO_MEMORY;
4766
4767	pRecVM->Core.Key = GCBaseAddr;
4768	for (uint32_t i = 0; i < cRegions; i++)
4769	pRecVM->aRegionsGCPtrs[i] = paRegions[i].GCRegionAddr;
4770
4771	bool fInsert = RTAvlGCPtrInsert(&pGVM->gmm.s.pSharedModuleTree, &pRecVM->Core);
4772	Assert(fInsert); NOREF(fInsert);
4773	pGVM->gmm.s.Stats.cShareableModules++;
4774
4775	*ppRecVM = pRecVM;
4776	return VINF_SUCCESS;
4777	}
4778
4779
4780	static void gmmR0ShModDeletePerVM(PGMM pGMM, PGVM pGVM, PGMMSHAREDMODULEPERVM pRecVM, bool fRemove)
4781	{
4782	/*
4783	* Free the per-VM module.
4784	*/
4785	PGMMSHAREDMODULE pGblMod = pRecVM->pGlobalModule;
4786	pRecVM->pGlobalModule = NULL;
4787
4788	if (fRemove)
4789	{
4790	void *pvTest = RTAvlGCPtrRemove(&pGVM->gmm.s.pSharedModuleTree, pRecVM->Core.Key);
4791	Assert(pvTest == &pRecVM->Core); NOREF(pvTest);
4792	}
4793
4794	RTMemFree(pRecVM);
4795
4796	/*
4797	* Release the global module.
4798	* (In the registration bailout case, it might not be.)
4799	*/
4800	if (pGblMod)
4801	{
4802	Assert(pGblMod->cUsers > 0);
4803	pGblMod->cUsers--;
4804	if (pGblMod->cUsers == 0)
4805	gmmR0ShModDeleteGlobal(pGMM, pGblMod);
4806	}
4807	}
4808
4809	#endif /* VBOX_WITH_PAGE_SHARING */
4810
4811	/**
4812	* Registers a new shared module for the VM.
4813	*
4814	* @returns VBox status code.
4815	* @param pGVM The global (ring-0) VM structure.
4816	* @param idCpu The VCPU id.
4817	* @param enmGuestOS The guest OS type.
4818	* @param pszModuleName The module name.
4819	* @param pszVersion The module version.
4820	* @param GCPtrModBase The module base address.
4821	* @param cbModule The module size.
4822	* @param cRegions The number of shared region descriptors.
4823	* @param paRegions Pointer to an array of shared region(s).
4824	* @thread EMT(idCpu)
4825	*/
4826	GMMR0DECL(int) GMMR0RegisterSharedModule(PGVM pGVM, VMCPUID idCpu, VBOXOSFAMILY enmGuestOS, char *pszModuleName,
4827	char *pszVersion, RTGCPTR GCPtrModBase, uint32_t cbModule,
4828	uint32_t cRegions, struct VMMDEVSHAREDREGIONDESC const *paRegions)
4829	{
4830	#ifdef VBOX_WITH_PAGE_SHARING
4831	/*
4832	* Validate input and get the basics.
4833	*
4834	* Note! Turns out the module size doesn't necessarily match the size of the
4835	* regions. (iTunes on XP)
4836	*/
4837	PGMM pGMM;
4838	GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
4839	int rc = GVMMR0ValidateGVMandEMT(pGVM, idCpu);
4840	if (RT_FAILURE(rc))
4841	return rc;
4842
4843	if (RT_UNLIKELY(cRegions > VMMDEVSHAREDREGIONDESC_MAX))
4844	return VERR_GMM_TOO_MANY_REGIONS;
4845
4846	if (RT_UNLIKELY(cbModule == 0 || cbModule > _1G))
4847	return VERR_GMM_BAD_SHARED_MODULE_SIZE;
4848
4849	uint32_t cbTotal = 0;
4850	for (uint32_t i = 0; i < cRegions; i++)
4851	{
4852	if (RT_UNLIKELY(paRegions[i].cbRegion == 0 || paRegions[i].cbRegion > _1G))
4853	return VERR_GMM_SHARED_MODULE_BAD_REGIONS_SIZE;
4854
4855	cbTotal += paRegions[i].cbRegion;
4856	if (RT_UNLIKELY(cbTotal > _1G))
4857	return VERR_GMM_SHARED_MODULE_BAD_REGIONS_SIZE;
4858	}
4859
4860	AssertPtrReturn(pszModuleName, VERR_INVALID_POINTER);
4861	if (RT_UNLIKELY(!memchr(pszModuleName, '\0', GMM_SHARED_MODULE_MAX_NAME_STRING)))
4862	return VERR_GMM_MODULE_NAME_TOO_LONG;
4863
4864	AssertPtrReturn(pszVersion, VERR_INVALID_POINTER);
4865	if (RT_UNLIKELY(!memchr(pszVersion, '\0', GMM_SHARED_MODULE_MAX_VERSION_STRING)))
4866	return VERR_GMM_MODULE_NAME_TOO_LONG;
4867
4868	uint32_t const uHash = gmmR0ShModCalcHash(pszModuleName, pszVersion);
4869	Log(("GMMR0RegisterSharedModule %s %s base %RGv size %x hash %x\n", pszModuleName, pszVersion, GCPtrModBase, cbModule, uHash));
4870
4871	/*
4872	* Take the semaphore and do some more validations.
4873	*/
4874	gmmR0MutexAcquire(pGMM);
4875	if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
4876	{
4877	/*
4878	* Check if this module is already locally registered and register
4879	* it if it isn't. The base address is a unique module identifier
4880	* locally.
4881	*/
4882	PGMMSHAREDMODULEPERVM pRecVM = (PGMMSHAREDMODULEPERVM)RTAvlGCPtrGet(&pGVM->gmm.s.pSharedModuleTree, GCPtrModBase);
4883	bool fNewModule = pRecVM == NULL;
4884	if (fNewModule)
4885	{
4886	rc = gmmR0ShModNewPerVM(pGVM, GCPtrModBase, cRegions, paRegions, &pRecVM);
4887	if (RT_SUCCESS(rc))
4888	{
4889	/*
4890	* Find a matching global module, register a new one if needed.
4891	*/
4892	PGMMSHAREDMODULE pGblMod = gmmR0ShModFindGlobal(pGMM, uHash, cbModule, enmGuestOS, cRegions,
4893	pszModuleName, pszVersion, paRegions);
4894	if (!pGblMod)
4895	{
4896	Assert(fNewModule);
4897	rc = gmmR0ShModNewGlobal(pGMM, uHash, cbModule, enmGuestOS, cRegions,
4898	pszModuleName, pszVersion, paRegions, &pGblMod);
4899	if (RT_SUCCESS(rc))
4900	{
4901	pRecVM->pGlobalModule = pGblMod; /* (One reference returned by gmmR0ShModNewGlobal.) */
4902	Log(("GMMR0RegisterSharedModule: new module %s %s\n", pszModuleName, pszVersion));
4903	}
4904	else
4905	gmmR0ShModDeletePerVM(pGMM, pGVM, pRecVM, true /*fRemove*/);
4906	}
4907	else
4908	{
4909	Assert(pGblMod->cUsers > 0 && pGblMod->cUsers < UINT32_MAX / 2);
4910	pGblMod->cUsers++;
4911	pRecVM->pGlobalModule = pGblMod;
4912
4913	Log(("GMMR0RegisterSharedModule: new per vm module %s %s, gbl users %d\n", pszModuleName, pszVersion, pGblMod->cUsers));
4914	}
4915	}
4916	}
4917	else
4918	{
4919	/*
4920	* Attempt to re-register an existing module.
4921	*/
4922	PGMMSHAREDMODULE pGblMod = gmmR0ShModFindGlobal(pGMM, uHash, cbModule, enmGuestOS, cRegions,
4923	pszModuleName, pszVersion, paRegions);
4924	if (pRecVM->pGlobalModule == pGblMod)
4925	{
4926	Log(("GMMR0RegisterSharedModule: already registered %s %s, gbl users %d\n", pszModuleName, pszVersion, pGblMod->cUsers));
4927	rc = VINF_GMM_SHARED_MODULE_ALREADY_REGISTERED;
4928	}
4929	else
4930	{
4931	/** @todo may have to unregister+register when this happens in case it's caused
4932	* by VBoxService crashing and being restarted... */
4933	Log(("GMMR0RegisterSharedModule: Address clash!\n"
4934	" incoming at %RGvLB%#x %s %s rgns %u\n"
4935	" existing at %RGvLB%#x %s %s rgns %u\n",
4936	GCPtrModBase, cbModule, pszModuleName, pszVersion, cRegions,
4937	pRecVM->Core.Key, pRecVM->pGlobalModule->cbModule, pRecVM->pGlobalModule->szName,
4938	pRecVM->pGlobalModule->szVersion, pRecVM->pGlobalModule->cRegions));
4939	rc = VERR_GMM_SHARED_MODULE_ADDRESS_CLASH;
4940	}
4941	}
4942	GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
4943	}
4944	else
4945	rc = VERR_GMM_IS_NOT_SANE;
4946
4947	gmmR0MutexRelease(pGMM);
4948	return rc;
4949	#else
4950
4951	NOREF(pGVM); NOREF(idCpu); NOREF(enmGuestOS); NOREF(pszModuleName); NOREF(pszVersion);
4952	NOREF(GCPtrModBase); NOREF(cbModule); NOREF(cRegions); NOREF(paRegions);
4953	return VERR_NOT_IMPLEMENTED;
4954	#endif
4955	}
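
The name and version checks in GMMR0RegisterSharedModule use memchr rather than strlen so that the scan never runs past the maximum string size. A minimal sketch of that pattern follows; the helper name `isTerminatedWithin` is hypothetical, and the sketch assumes the caller guarantees that `cbMax` bytes are readable at `psz`, as would hold for a fixed-size character array.

```cpp
#include <cstddef>
#include <cstring>

// Returns true when a NUL terminator occurs within the first cbMax bytes.
// Unlike strlen, this never reads beyond psz[cbMax - 1], so an unterminated
// buffer is rejected instead of being scanned past its end.
inline bool isTerminatedWithin(const char *psz, size_t cbMax)
{
    return std::memchr(psz, '\0', cbMax) != nullptr;
}
```

A string that fills the buffer exactly, with no room for the terminator, is rejected. That matters here because the accepted strings are later copied with strcpy into fixed-size fields of the global module record.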
4956
4957
4958	/**
4959	* VMMR0 request wrapper for GMMR0RegisterSharedModule.
4960	*
4961	* @returns see GMMR0RegisterSharedModule.
4962	* @param pGVM The global (ring-0) VM structure.
4963	* @param idCpu The VCPU id.
4964	* @param pReq Pointer to the request packet.
4965	*/
4966	GMMR0DECL(int) GMMR0RegisterSharedModuleReq(PGVM pGVM, VMCPUID idCpu, PGMMREGISTERSHAREDMODULEREQ pReq)
4967	{
4968	/*
4969	* Validate input and pass it on.
4970	*/
4971	AssertPtrReturn(pReq, VERR_INVALID_POINTER);
4972	AssertMsgReturn( pReq->Hdr.cbReq >= sizeof(*pReq)
4973	&& pReq->Hdr.cbReq == RT_UOFFSETOF_DYN(GMMREGISTERSHAREDMODULEREQ, aRegions[pReq->cRegions]),
4974	("%#x != %#x\n", pReq->Hdr.cbReq, sizeof(*pReq)), VERR_INVALID_PARAMETER);
4975
4976	/* Pass back return code in the request packet to preserve informational codes. (VMMR3CallR0 chokes on them) */
4977	pReq->rc = GMMR0RegisterSharedModule(pGVM, idCpu, pReq->enmGuestOS, pReq->szName, pReq->szVersion,
4978	pReq->GCBaseAddr, pReq->cbModule, pReq->cRegions, pReq->aRegions);
4979	return VINF_SUCCESS;
4980	}
4981
4982
4983	/**
4984	* Unregisters a shared module for the VM
4985	*
4986	* @returns VBox status code.
4987	* @param pGVM The global (ring-0) VM structure.
4988	* @param idCpu The VCPU id.
4989	* @param pszModuleName The module name.
4990	* @param pszVersion The module version.
4991	* @param GCPtrModBase The module base address.
4992	* @param cbModule The module size.
4993	*/
4994	GMMR0DECL(int) GMMR0UnregisterSharedModule(PGVM pGVM, VMCPUID idCpu, char *pszModuleName, char *pszVersion,
4995	RTGCPTR GCPtrModBase, uint32_t cbModule)
4996	{
4997	#ifdef VBOX_WITH_PAGE_SHARING
4998	/*
4999	* Validate input and get the basics.
5000	*/
5001	PGMM pGMM;
5002	GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
5003	int rc = GVMMR0ValidateGVMandEMT(pGVM, idCpu);
5004	if (RT_FAILURE(rc))
5005	return rc;
5006
5007	AssertPtrReturn(pszModuleName, VERR_INVALID_POINTER);
5008	AssertPtrReturn(pszVersion, VERR_INVALID_POINTER);
5009	if (RT_UNLIKELY(!memchr(pszModuleName, '\0', GMM_SHARED_MODULE_MAX_NAME_STRING)))
5010	return VERR_GMM_MODULE_NAME_TOO_LONG;
5011	if (RT_UNLIKELY(!memchr(pszVersion, '\0', GMM_SHARED_MODULE_MAX_VERSION_STRING)))
5012	return VERR_GMM_MODULE_NAME_TOO_LONG;
5013
5014	Log(("GMMR0UnregisterSharedModule %s %s base=%RGv size %x\n", pszModuleName, pszVersion, GCPtrModBase, cbModule));
5015
5016	/*
5017	* Take the semaphore and do some more validations.
5018	*/
5019	gmmR0MutexAcquire(pGMM);
5020	if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
5021	{
5022	/*
5023	* Locate and remove the specified module.
5024	*/
5025	PGMMSHAREDMODULEPERVM pRecVM = (PGMMSHAREDMODULEPERVM)RTAvlGCPtrGet(&pGVM->gmm.s.pSharedModuleTree, GCPtrModBase);
5026	if (pRecVM)
5027	{
5028	/** @todo Do we need to do more validations here, like that the
5029	* name + version + cbModule matches? */
5030	NOREF(cbModule);
5031	Assert(pRecVM->pGlobalModule);
5032	gmmR0ShModDeletePerVM(pGMM, pGVM, pRecVM, true /*fRemove*/);
5033	}
5034	else
5035	rc = VERR_GMM_SHARED_MODULE_NOT_FOUND;
5036
5037	GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
5038	}
5039	else
5040	rc = VERR_GMM_IS_NOT_SANE;
5041
5042	gmmR0MutexRelease(pGMM);
5043	return rc;
5044	#else
5045
5046	NOREF(pGVM); NOREF(idCpu); NOREF(pszModuleName); NOREF(pszVersion); NOREF(GCPtrModBase); NOREF(cbModule);
5047	return VERR_NOT_IMPLEMENTED;
5048	#endif
5049	}
5050
5051
5052	/**
5053	* VMMR0 request wrapper for GMMR0UnregisterSharedModule.
5054	*
5055	* @returns see GMMR0UnregisterSharedModule.
5056	* @param pGVM The global (ring-0) VM structure.
5057	* @param idCpu The VCPU id.
5058	* @param pReq Pointer to the request packet.
5059	*/
5060	GMMR0DECL(int) GMMR0UnregisterSharedModuleReq(PGVM pGVM, VMCPUID idCpu, PGMMUNREGISTERSHAREDMODULEREQ pReq)
5061	{
5062	/*
5063	* Validate input and pass it on.
5064	*/
5065	AssertPtrReturn(pReq, VERR_INVALID_POINTER);
5066	AssertMsgReturn(pReq->Hdr.cbReq == sizeof(*pReq), ("%#x != %#x\n", pReq->Hdr.cbReq, sizeof(*pReq)), VERR_INVALID_PARAMETER);
5067
5068	return GMMR0UnregisterSharedModule(pGVM, idCpu, pReq->szName, pReq->szVersion, pReq->GCBaseAddr, pReq->cbModule);
5069	}
5070
5071	#ifdef VBOX_WITH_PAGE_SHARING
5072
5073	/**
5074	* Increase the use count of a shared page, the page is known to exist and be valid and such.
5075	*
5076	* @param pGMM Pointer to the GMM instance.
5077	* @param pGVM Pointer to the GVM instance.
5078	* @param pPage The page structure.
5079	*/
5080	DECLINLINE(void) gmmR0UseSharedPage(PGMM pGMM, PGVM pGVM, PGMMPAGE pPage)
5081	{
5082	Assert(pGMM->cSharedPages > 0);
5083	Assert(pGMM->cAllocatedPages > 0);
5084
5085	pGMM->cDuplicatePages++;
5086
5087	pPage->Shared.cRefs++;
5088	pGVM->gmm.s.Stats.cSharedPages++;
5089	pGVM->gmm.s.Stats.Allocated.cBasePages++;
5090	}
5091
5092
5093	/**
5094	* Converts a private page to a shared page, the page is known to exist and be valid and such.
5095	*
5096	* @param pGMM Pointer to the GMM instance.
5097	* @param pGVM Pointer to the GVM instance.
5098	* @param HCPhys Host physical address
5099	* @param idPage The Page ID
5100	* @param pPage The page structure.
5101	* @param pPageDesc Shared page descriptor
5102	*/
5103	DECLINLINE(void) gmmR0ConvertToSharedPage(PGMM pGMM, PGVM pGVM, RTHCPHYS HCPhys, uint32_t idPage, PGMMPAGE pPage,
5104	PGMMSHAREDPAGEDESC pPageDesc)
5105	{
5106	PGMMCHUNK pChunk = gmmR0GetChunk(pGMM, idPage >> GMM_CHUNKID_SHIFT);
5107	Assert(pChunk);
5108	Assert(pChunk->cFree < GMM_CHUNK_NUM_PAGES);
5109	Assert(GMM_PAGE_IS_PRIVATE(pPage));
5110
5111	pChunk->cPrivate--;
5112	pChunk->cShared++;
5113
5114	pGMM->cSharedPages++;
5115
5116	pGVM->gmm.s.Stats.cSharedPages++;
5117	pGVM->gmm.s.Stats.cPrivatePages--;
5118
5119	/* Modify the page structure. */
5120	pPage->Shared.pfn = (uint32_t)(uint64_t)(HCPhys >> PAGE_SHIFT);
5121	pPage->Shared.cRefs = 1;
5122	#ifdef VBOX_STRICT
5123	pPageDesc->u32StrictChecksum = gmmR0StrictPageChecksum(pGMM, pGVM, idPage);
5124	pPage->Shared.u14Checksum = pPageDesc->u32StrictChecksum;
5125	#else
5126	NOREF(pPageDesc);
5127	pPage->Shared.u14Checksum = 0;
5128	#endif
5129	pPage->Shared.u2State = GMM_PAGE_STATE_SHARED;
5130	}
5131
5132
5133	static int gmmR0SharedModuleCheckPageFirstTime(PGMM pGMM, PGVM pGVM, PGMMSHAREDMODULE pModule,
5134	unsigned idxRegion, unsigned idxPage,
5135	PGMMSHAREDPAGEDESC pPageDesc, PGMMSHAREDREGIONDESC pGlobalRegion)
5136	{
5137	NOREF(pModule);
5138
5139	/* Easy case: just change the internal page type. */
5140	PGMMPAGE pPage = gmmR0GetPage(pGMM, pPageDesc->idPage);
5141	AssertMsgReturn(pPage, ("idPage=%#x (GCPhys=%RGp HCPhys=%RHp idxRegion=%#x idxPage=%#x) #1\n",
5142	pPageDesc->idPage, pPageDesc->GCPhys, pPageDesc->HCPhys, idxRegion, idxPage),
5143	VERR_PGM_PHYS_INVALID_PAGE_ID);
5144	NOREF(idxRegion);
5145
5146	AssertMsg(pPageDesc->GCPhys == (pPage->Private.pfn << 12), ("desc %RGp gmm %RGp\n", pPageDesc->HCPhys, (pPage->Private.pfn << 12)));
5147
5148	gmmR0ConvertToSharedPage(pGMM, pGVM, pPageDesc->HCPhys, pPageDesc->idPage, pPage, pPageDesc);
5149
5150	/* Keep track of these references. */
5151	pGlobalRegion->paidPages[idxPage] = pPageDesc->idPage;
5152
5153	return VINF_SUCCESS;
5154	}
5155
5156	/**
5157	* Checks specified shared module range for changes
5158	*
5159	* Performs the following tasks:
5160	* - If a shared page is new, then it changes the GMM page type to shared and
5161	* returns it in the pPageDesc descriptor.
5162	* - If a shared page already exists, then it checks if the VM page is
5163	* identical and if so frees the VM page and returns the shared page in
5164	* pPageDesc descriptor.
5165	*
5166	* @remarks ASSUMES the caller has acquired the GMM semaphore!!
5167	*
5168	* @returns VBox status code.
5169	* @param pGVM Pointer to the GVM instance data.
5170	* @param pModule Module description
5171	* @param idxRegion Region index
5172	* @param idxPage Page index
5173	* @param pPageDesc Page descriptor
5174	*/
5175	GMMR0DECL(int) GMMR0SharedModuleCheckPage(PGVM pGVM, PGMMSHAREDMODULE pModule, uint32_t idxRegion, uint32_t idxPage,
5176	PGMMSHAREDPAGEDESC pPageDesc)
5177	{
5178	int rc;
5179	PGMM pGMM;
5180	GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
5181	pPageDesc->u32StrictChecksum = 0;
5182
5183	AssertMsgReturn(idxRegion < pModule->cRegions,
5184	("idxRegion=%#x cRegions=%#x %s %s\n", idxRegion, pModule->cRegions, pModule->szName, pModule->szVersion),
5185	VERR_INVALID_PARAMETER);
5186
5187	uint32_t const cPages = pModule->aRegions[idxRegion].cb >> PAGE_SHIFT;
5188	AssertMsgReturn(idxPage < cPages,
5189	("idxRegion=%#x cRegions=%#x %s %s\n", idxRegion, pModule->cRegions, pModule->szName, pModule->szVersion),
5190	VERR_INVALID_PARAMETER);
5191
5192	LogFlow(("GMMR0SharedModuleCheckRange %s base %RGv region %d idxPage %d\n", pModule->szName, pModule->Core.Key, idxRegion, idxPage));
5193
5194	/*
5195	* First time; create a page descriptor array.
5196	*/
5197	PGMMSHAREDREGIONDESC pGlobalRegion = &pModule->aRegions[idxRegion];
5198	if (!pGlobalRegion->paidPages)
5199	{
5200	Log(("Allocate page descriptor array for %d pages\n", cPages));
5201	pGlobalRegion->paidPages = (uint32_t *)RTMemAlloc(cPages * sizeof(pGlobalRegion->paidPages[0]));
5202	AssertReturn(pGlobalRegion->paidPages, VERR_NO_MEMORY);
5203
5204	/* Invalidate all descriptors. */
5205	uint32_t i = cPages;
5206	while (i-- > 0)
5207	pGlobalRegion->paidPages[i] = NIL_GMM_PAGEID;
5208	}
5209
5210	/*
5211	* We've seen this shared page for the first time?
5212	*/
5213	if (pGlobalRegion->paidPages[idxPage] == NIL_GMM_PAGEID)
5214	{
5215	Log(("New shared page guest %RGp host %RHp\n", pPageDesc->GCPhys, pPageDesc->HCPhys));
5216	return gmmR0SharedModuleCheckPageFirstTime(pGMM, pGVM, pModule, idxRegion, idxPage, pPageDesc, pGlobalRegion);
5217	}
5218
5219	/*
5220	* We've seen it before...
5221	*/
5222	Log(("Replace existing page guest %RGp host %RHp id %#x -> id %#x\n",
5223	pPageDesc->GCPhys, pPageDesc->HCPhys, pPageDesc->idPage, pGlobalRegion->paidPages[idxPage]));
5224	Assert(pPageDesc->idPage != pGlobalRegion->paidPages[idxPage]);
5225
5226	/*
5227	* Get the shared page source.
5228	*/
5229	PGMMPAGE pPage = gmmR0GetPage(pGMM, pGlobalRegion->paidPages[idxPage]);
5230	AssertMsgReturn(pPage, ("idPage=%#x (idxRegion=%#x idxPage=%#x) #2\n", pPageDesc->idPage, idxRegion, idxPage),
5231	VERR_PGM_PHYS_INVALID_PAGE_ID);
5232
5233	if (pPage->Common.u2State != GMM_PAGE_STATE_SHARED)
5234	{
5235	/*
5236	* Page was freed at some point; invalidate this entry.
5237	*/
5238	/** @todo this isn't really bullet proof. */
5239	Log(("Old shared page was freed -> create a new one\n"));
5240	pGlobalRegion->paidPages[idxPage] = NIL_GMM_PAGEID;
5241	return gmmR0SharedModuleCheckPageFirstTime(pGMM, pGVM, pModule, idxRegion, idxPage, pPageDesc, pGlobalRegion);
5242	}
5243
5244	Log(("Replace existing page guest host %RHp -> %RHp\n", pPageDesc->HCPhys, ((uint64_t)pPage->Shared.pfn) << PAGE_SHIFT));
5245
5246	/*
5247	* Calculate the virtual address of the local page.
5248	*/
5249	PGMMCHUNK pChunk = gmmR0GetChunk(pGMM, pPageDesc->idPage >> GMM_CHUNKID_SHIFT);
5250	AssertMsgReturn(pChunk, ("idPage=%#x (idxRegion=%#x idxPage=%#x) #4\n", pPageDesc->idPage, idxRegion, idxPage),
5251	VERR_PGM_PHYS_INVALID_PAGE_ID);
5252
5253	uint8_t *pbChunk;
5254	AssertMsgReturn(gmmR0IsChunkMapped(pGMM, pGVM, pChunk, (PRTR3PTR)&pbChunk),
5255	("idPage=%#x (idxRegion=%#x idxPage=%#x) #3\n", pPageDesc->idPage, idxRegion, idxPage),
5256	VERR_PGM_PHYS_INVALID_PAGE_ID);
5257	uint8_t *pbLocalPage = pbChunk + ((pPageDesc->idPage & GMM_PAGEID_IDX_MASK) << PAGE_SHIFT);
5258
5259	/*
5260	* Calculate the virtual address of the shared page.
5261	*/
5262	pChunk = gmmR0GetChunk(pGMM, pGlobalRegion->paidPages[idxPage] >> GMM_CHUNKID_SHIFT);
5263	Assert(pChunk); /* can't fail as gmmR0GetPage succeeded. */
5264
5265	/*
5266	* Get the virtual address of the physical page; map the chunk into the VM
5267	* process if not already done.
5268	*/
5269	if (!gmmR0IsChunkMapped(pGMM, pGVM, pChunk, (PRTR3PTR)&pbChunk))
5270	{
5271	Log(("Map chunk into process!\n"));
5272	rc = gmmR0MapChunk(pGMM, pGVM, pChunk, false /*fRelaxedSem*/, (PRTR3PTR)&pbChunk);
5273	AssertRCReturn(rc, rc);
5274	}
5275	uint8_t *pbSharedPage = pbChunk + ((pGlobalRegion->paidPages[idxPage] & GMM_PAGEID_IDX_MASK) << PAGE_SHIFT);
5276
5277	#ifdef VBOX_STRICT
5278	pPageDesc->u32StrictChecksum = RTCrc32(pbSharedPage, PAGE_SIZE);
5279	uint32_t uChecksum = pPageDesc->u32StrictChecksum & UINT32_C(0x00003fff);
5280	AssertMsg(!uChecksum || uChecksum == pPage->Shared.u14Checksum || !pPage->Shared.u14Checksum,
5281	("%#x vs %#x - idPage=%#x - %s %s\n", uChecksum, pPage->Shared.u14Checksum,
5282	pGlobalRegion->paidPages[idxPage], pModule->szName, pModule->szVersion));
5283	#endif
5284
5285	/** @todo write ASMMemComparePage. */
5286	if (memcmp(pbSharedPage, pbLocalPage, PAGE_SIZE))
5287	{
5288	Log(("Unexpected differences found between local and shared page; skip\n"));
5289	/* Signal to the caller that this one hasn't changed. */
5290	pPageDesc->idPage = NIL_GMM_PAGEID;
5291	return VINF_SUCCESS;
5292	}
5293
5294	/*
5295	* Free the old local page.
5296	*/
5297	GMMFREEPAGEDESC PageDesc;
5298	PageDesc.idPage = pPageDesc->idPage;
5299	rc = gmmR0FreePages(pGMM, pGVM, 1, &PageDesc, GMMACCOUNT_BASE);
5300	AssertRCReturn(rc, rc);
5301
5302	gmmR0UseSharedPage(pGMM, pGVM, pPage);
5303
5304	/*
5305	* Pass along the new physical address & page id.
5306	*/
5307	pPageDesc->HCPhys = ((uint64_t)pPage->Shared.pfn) << PAGE_SHIFT;
5308	pPageDesc->idPage = pGlobalRegion->paidPages[idxPage];
5309
5310	return VINF_SUCCESS;
5311	}
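
GMMR0SharedModuleCheckPage locates both the local and the shared page by splitting a page ID into a chunk ID (the high bits) and a page index within that chunk (the low bits), then scaling the index by the page size to get a byte offset into the chunk mapping. A minimal sketch of that split follows. The shift and mask values are assumptions for illustration only (4 KiB pages, 512 pages per chunk); the real GMM_CHUNKID_SHIFT and GMM_PAGEID_IDX_MASK are defined elsewhere and may differ. `PageLocation` and `locatePage` are hypothetical names.

```cpp
#include <cstdint>

// Assumed values, not taken from the GMM headers.
constexpr uint32_t kPageShift     = 12; // 4 KiB pages
constexpr uint32_t kChunkIdShift  = 9;  // 512 pages per chunk
constexpr uint32_t kPageIdIdxMask = (UINT32_C(1) << kChunkIdShift) - 1;

struct PageLocation
{
    uint32_t idChunk;    // which chunk the page lives in
    uint32_t offInChunk; // byte offset of the page inside the chunk mapping
};

// High bits of the page ID select the chunk, low bits select the page in it.
constexpr PageLocation locatePage(uint32_t idPage)
{
    return PageLocation{ idPage >> kChunkIdShift,
                         (idPage & kPageIdIdxMask) << kPageShift };
}
```

Adding `offInChunk` to the address at which the chunk is mapped gives the page's virtual address, which is how `pbLocalPage` and `pbSharedPage` are derived before the memcmp.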
5312
5313
5314	/**
5315	* RTAvlGCPtrDestroy callback.
5316	*
5317	* @returns 0 or VERR_GMM_INSTANCE.
5318	* @param pNode The node to destroy.
5319	* @param pvArgs Pointer to an argument packet.
5320	*/
5321	static DECLCALLBACK(int) gmmR0CleanupSharedModule(PAVLGCPTRNODECORE pNode, void *pvArgs)
5322	{
5323	gmmR0ShModDeletePerVM(((GMMR0SHMODPERVMDTORARGS *)pvArgs)->pGMM,
5324	((GMMR0SHMODPERVMDTORARGS *)pvArgs)->pGVM,
5325	(PGMMSHAREDMODULEPERVM)pNode,
5326	false /*fRemove*/);
5327	return VINF_SUCCESS;
5328	}
5329
5330
5331	/**
5332	* Used by GMMR0CleanupVM to clean up shared modules.
5333	*
5334	* This is called without taking the GMM lock so that it can be yielded as
5335	* needed here.
5336	*
5337	* @param pGMM The GMM handle.
5338	* @param pGVM The global VM handle.
5339	*/
5340	static void gmmR0SharedModuleCleanup(PGMM pGMM, PGVM pGVM)
5341	{
5342	gmmR0MutexAcquire(pGMM);
5343	GMM_CHECK_SANITY_UPON_ENTERING(pGMM);
5344
5345	GMMR0SHMODPERVMDTORARGS Args;
5346	Args.pGVM = pGVM;
5347	Args.pGMM = pGMM;
5348	RTAvlGCPtrDestroy(&pGVM->gmm.s.pSharedModuleTree, gmmR0CleanupSharedModule, &Args);
5349
5350	AssertMsg(pGVM->gmm.s.Stats.cShareableModules == 0, ("%d\n", pGVM->gmm.s.Stats.cShareableModules));
5351	pGVM->gmm.s.Stats.cShareableModules = 0;
5352
5353	gmmR0MutexRelease(pGMM);
5354	}
5355
5356	#endif /* VBOX_WITH_PAGE_SHARING */
5357
5358	/**
5359	* Removes all shared modules for the specified VM
5360	*
5361	* @returns VBox status code.
5362	* @param pGVM The global (ring-0) VM structure.
5363	* @param idCpu The VCPU id.
5364	*/
5365	GMMR0DECL(int) GMMR0ResetSharedModules(PGVM pGVM, VMCPUID idCpu)
5366	{
5367	#ifdef VBOX_WITH_PAGE_SHARING
5368	/*
5369	* Validate input and get the basics.
5370	*/
5371	PGMM pGMM;
5372	GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
5373	int rc = GVMMR0ValidateGVMandEMT(pGVM, idCpu);
5374	if (RT_FAILURE(rc))
5375	return rc;
5376
5377	/*
5378	* Take the semaphore and do some more validations.
5379	*/
5380	gmmR0MutexAcquire(pGMM);
5381	if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
5382	{
5383	Log(("GMMR0ResetSharedModules\n"));
5384	GMMR0SHMODPERVMDTORARGS Args;
5385	Args.pGVM = pGVM;
5386	Args.pGMM = pGMM;
5387	RTAvlGCPtrDestroy(&pGVM->gmm.s.pSharedModuleTree, gmmR0CleanupSharedModule, &Args);
5388	pGVM->gmm.s.Stats.cShareableModules = 0;
5389
5390	rc = VINF_SUCCESS;
5391	GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
5392	}
5393	else
5394	rc = VERR_GMM_IS_NOT_SANE;
5395
5396	gmmR0MutexRelease(pGMM);
5397	return rc;
5398	#else
5399	RT_NOREF(pGVM, idCpu);
5400	return VERR_NOT_IMPLEMENTED;
5401	#endif
5402	}
5403
5404	#ifdef VBOX_WITH_PAGE_SHARING
5405
5406	/**
5407	* Tree enumeration callback for checking a shared module.
5408	*/
5409	static DECLCALLBACK(int) gmmR0CheckSharedModule(PAVLGCPTRNODECORE pNode, void *pvUser)
5410	{
5411	GMMCHECKSHAREDMODULEINFO *pArgs = (GMMCHECKSHAREDMODULEINFO *)pvUser;
5412	PGMMSHAREDMODULEPERVM pRecVM = (PGMMSHAREDMODULEPERVM)pNode;
5413	PGMMSHAREDMODULE pGblMod = pRecVM->pGlobalModule;
5414
5415	Log(("gmmR0CheckSharedModule: check %s %s base=%RGv size=%x\n",
5416	pGblMod->szName, pGblMod->szVersion, pGblMod->Core.Key, pGblMod->cbModule));
5417
5418	int rc = PGMR0SharedModuleCheck(pArgs->pGVM, pArgs->pGVM, pArgs->idCpu, pGblMod, pRecVM->aRegionsGCPtrs);
5419	if (RT_FAILURE(rc))
5420	return rc;
5421	return VINF_SUCCESS;
5422	}
5423
5424	#endif /* VBOX_WITH_PAGE_SHARING */
5425
5426	/**
5427	* Check all shared modules for the specified VM.
5428	*
5429	* @returns VBox status code.
5430	* @param pGVM The global (ring-0) VM structure.
5431	* @param idCpu The calling EMT number.
5432	* @thread EMT(idCpu)
5433	*/
5434	GMMR0DECL(int) GMMR0CheckSharedModules(PGVM pGVM, VMCPUID idCpu)
5435	{
5436	#ifdef VBOX_WITH_PAGE_SHARING
5437	/*
5438	* Validate input and get the basics.
5439	*/
5440	PGMM pGMM;
5441	GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
5442	int rc = GVMMR0ValidateGVMandEMT(pGVM, idCpu);
5443	if (RT_FAILURE(rc))
5444	return rc;
5445
5446	# ifndef DEBUG_sandervl
5447	/*
5448	* Take the semaphore and do some more validations.
5449	*/
5450	gmmR0MutexAcquire(pGMM);
5451	# endif
5452	if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
5453	{
5454	/*
5455	* Walk the tree, checking each module.
5456	*/
5457	Log(("GMMR0CheckSharedModules\n"));
5458
5459	GMMCHECKSHAREDMODULEINFO Args;
5460	Args.pGVM = pGVM;
5461	Args.idCpu = idCpu;
5462	rc = RTAvlGCPtrDoWithAll(&pGVM->gmm.s.pSharedModuleTree, true /* fFromLeft */, gmmR0CheckSharedModule, &Args);
5463
5464	Log(("GMMR0CheckSharedModules done (rc=%Rrc)!\n", rc));
5465	GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
5466	}
5467	else
5468	rc = VERR_GMM_IS_NOT_SANE;
5469
5470	# ifndef DEBUG_sandervl
5471	gmmR0MutexRelease(pGMM);
5472	# endif
5473	return rc;
5474	#else
5475	RT_NOREF(pGVM, idCpu);
5476	return VERR_NOT_IMPLEMENTED;
5477	#endif
5478	}
5479
5480	#ifdef VBOX_STRICT
5481
5482	/**
5483	* Worker for GMMR0FindDuplicatePageReq.
5484	*
5485	* @returns true if duplicate, false if not.
5486	*/
5487	static bool gmmR0FindDupPageInChunk(PGMM pGMM, PGVM pGVM, PGMMCHUNK pChunk, uint8_t const *pbSourcePage)
5488	{
5489	bool fFoundDuplicate = false;
5490	/* Only take chunks not mapped into this VM process; not entirely correct. */
5491	uint8_t *pbChunk;
5492	if (!gmmR0IsChunkMapped(pGMM, pGVM, pChunk, (PRTR3PTR)&pbChunk))
5493	{
5494	int rc = gmmR0MapChunk(pGMM, pGVM, pChunk, false /*fRelaxedSem*/, (PRTR3PTR)&pbChunk);
5495	if (RT_SUCCESS(rc))
5496	{
5497	/*
5498	* Look for duplicate pages
5499	*/
5500	uintptr_t iPage = (GMM_CHUNK_SIZE >> PAGE_SHIFT);
5501	while (iPage-- > 0)
5502	{
5503	if (GMM_PAGE_IS_PRIVATE(&pChunk->aPages[iPage]))
5504	{
5505	uint8_t *pbDestPage = pbChunk + (iPage << PAGE_SHIFT);
5506	if (!memcmp(pbSourcePage, pbDestPage, PAGE_SIZE))
5507	{
5508	fFoundDuplicate = true;
5509	break;
5510	}
5511	}
5512	}
5513	gmmR0UnmapChunk(pGMM, pGVM, pChunk, false /*fRelaxedSem*/);
5514	}
5515	}
5516	return fFoundDuplicate;
5517	}
5518
5519
5520	/**
5521	* Find a duplicate of the specified page in other active VMs
5522	*
5523	* @returns VBox status code.
5524	* @param pGVM The global (ring-0) VM structure.
5525	* @param pReq Pointer to the request packet.
5526	*/
5527	GMMR0DECL(int) GMMR0FindDuplicatePageReq(PGVM pGVM, PGMMFINDDUPLICATEPAGEREQ pReq)
5528	{
5529	/*
5530	* Validate input and pass it on.
5531	*/
5532	AssertPtrReturn(pReq, VERR_INVALID_POINTER);
5533	AssertMsgReturn(pReq->Hdr.cbReq == sizeof(*pReq), ("%#x != %#x\n", pReq->Hdr.cbReq, sizeof(*pReq)), VERR_INVALID_PARAMETER);
5534
5535	PGMM pGMM;
5536	GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
5537
5538	int rc = GVMMR0ValidateGVM(pGVM);
5539	if (RT_FAILURE(rc))
5540	return rc;
5541
5542	/*
5543	* Take the semaphore and do some more validations.
5544	*/
5545	rc = gmmR0MutexAcquire(pGMM);
5546	if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
5547	{
5548	uint8_t *pbChunk;
5549	PGMMCHUNK pChunk = gmmR0GetChunk(pGMM, pReq->idPage >> GMM_CHUNKID_SHIFT);
5550	if (pChunk)
5551	{
5552	if (gmmR0IsChunkMapped(pGMM, pGVM, pChunk, (PRTR3PTR)&pbChunk))
5553	{
5554	uint8_t *pbSourcePage = pbChunk + ((pReq->idPage & GMM_PAGEID_IDX_MASK) << PAGE_SHIFT);
5555	PGMMPAGE pPage = gmmR0GetPage(pGMM, pReq->idPage);
5556	if (pPage)
5557	{
5558	/*
5559	* Walk the chunks
5560	*/
5561	pReq->fDuplicate = false;
5562	RTListForEach(&pGMM->ChunkList, pChunk, GMMCHUNK, ListNode)
5563	{
5564	if (gmmR0FindDupPageInChunk(pGMM, pGVM, pChunk, pbSourcePage))
5565	{
5566	pReq->fDuplicate = true;
5567	break;
5568	}
5569	}
5570	}
5571	else
5572	{
5573	AssertFailed();
5574	rc = VERR_PGM_PHYS_INVALID_PAGE_ID;
5575	}
5576	}
5577	else
5578	AssertFailed();
5579	}
5580	else
5581	AssertFailed();
5582	}
5583	else
5584	rc = VERR_GMM_IS_NOT_SANE;
5585
5586	gmmR0MutexRelease(pGMM);
5587	return rc;
5588	}
5589
5590	#endif /* VBOX_STRICT */
5591

/**
 * Retrieves the GMM statistics visible to the caller.
 *
 * @returns VBox status code.
 *
 * @param   pStats      Where to put the statistics.
 * @param   pSession    The current session.
 * @param   pGVM        The GVM to obtain statistics for. Optional.
 */
GMMR0DECL(int) GMMR0QueryStatistics(PGMMSTATS pStats, PSUPDRVSESSION pSession, PGVM pGVM)
{
    LogFlow(("GMMR0QueryStatistics: pStats=%p pSession=%p pGVM=%p\n", pStats, pSession, pGVM));

    /*
     * Validate input.
     */
    AssertPtrReturn(pSession, VERR_INVALID_POINTER);
    AssertPtrReturn(pStats, VERR_INVALID_POINTER);
    pStats->cMaxPages = 0; /* (crash before taking the mutex...) */

    PGMM pGMM;
    GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);

    /*
     * Validate the VM handle, if not NULL, and lock the GMM.
     */
    int rc;
    if (pGVM)
    {
        rc = GVMMR0ValidateGVM(pGVM);
        if (RT_FAILURE(rc))
            return rc;
    }

    rc = gmmR0MutexAcquire(pGMM);
    if (RT_FAILURE(rc))
        return rc;

    /*
     * Copy out the GMM statistics.
     */
    pStats->cMaxPages               = pGMM->cMaxPages;
    pStats->cReservedPages          = pGMM->cReservedPages;
    pStats->cOverCommittedPages     = pGMM->cOverCommittedPages;
    pStats->cAllocatedPages         = pGMM->cAllocatedPages;
    pStats->cSharedPages            = pGMM->cSharedPages;
    pStats->cDuplicatePages         = pGMM->cDuplicatePages;
    pStats->cLeftBehindSharedPages  = pGMM->cLeftBehindSharedPages;
    pStats->cBalloonedPages         = pGMM->cBalloonedPages;
    pStats->cChunks                 = pGMM->cChunks;
    pStats->cFreedChunks            = pGMM->cFreedChunks;
    pStats->cShareableModules       = pGMM->cShareableModules;
    pStats->idFreeGeneration        = pGMM->idFreeGeneration;
    RT_ZERO(pStats->au64Reserved);

    /*
     * Copy out the VM statistics.
     */
    if (pGVM)
        pStats->VMStats = pGVM->gmm.s.Stats;
    else
        RT_ZERO(pStats->VMStats);

    gmmR0MutexRelease(pGMM);
    return rc;
}


/**
 * VMMR0 request wrapper for GMMR0QueryStatistics.
 *
 * @returns see GMMR0QueryStatistics.
 * @param   pGVM        The global (ring-0) VM structure. Optional.
 * @param   pReq        Pointer to the request packet.
 */
GMMR0DECL(int) GMMR0QueryStatisticsReq(PGVM pGVM, PGMMQUERYSTATISTICSSREQ pReq)
{
    /*
     * Validate input and pass it on.
     */
    AssertPtrReturn(pReq, VERR_INVALID_POINTER);
    AssertMsgReturn(pReq->Hdr.cbReq == sizeof(*pReq), ("%#x != %#x\n", pReq->Hdr.cbReq, sizeof(*pReq)), VERR_INVALID_PARAMETER);

    return GMMR0QueryStatistics(&pReq->Stats, pReq->pSession, pGVM);
}


/**
 * Resets the specified GMM statistics.
 *
 * @returns VBox status code.
 *
 * @param   pStats      Which statistics to reset; non-zero fields indicate
 *                      the ones to reset.
 * @param   pSession    The current session.
 * @param   pGVM        The GVM to reset statistics for. Optional.
 */
GMMR0DECL(int) GMMR0ResetStatistics(PCGMMSTATS pStats, PSUPDRVSESSION pSession, PGVM pGVM)
{
    NOREF(pStats); NOREF(pSession); NOREF(pGVM);
    /* There is currently nothing we can reset. */
    return VINF_SUCCESS;
}


/**
 * VMMR0 request wrapper for GMMR0ResetStatistics.
 *
 * @returns see GMMR0ResetStatistics.
 * @param   pGVM        The global (ring-0) VM structure. Optional.
 * @param   pReq        Pointer to the request packet.
 */
GMMR0DECL(int) GMMR0ResetStatisticsReq(PGVM pGVM, PGMMRESETSTATISTICSSREQ pReq)
{
    /*
     * Validate input and pass it on.
     */
    AssertPtrReturn(pReq, VERR_INVALID_POINTER);
    AssertMsgReturn(pReq->Hdr.cbReq == sizeof(*pReq), ("%#x != %#x\n", pReq->Hdr.cbReq, sizeof(*pReq)), VERR_INVALID_PARAMETER);

    return GMMR0ResetStatistics(&pReq->Stats, pReq->pSession, pGVM);
}
