1/* $Id: GMMR0.cpp 91540 2021-10-04 12:12:32Z vboxsync $ */
2/** @file
3 * GMM - Global Memory Manager.
4 */
5
6/*
7 * Copyright (C) 2007-2020 Oracle Corporation
8 *
9 * This file is part of VirtualBox Open Source Edition (OSE), as
10 * available from http://www.virtualbox.org. This file is free software;
11 * you can redistribute it and/or modify it under the terms of the GNU
12 * General Public License (GPL) as published by the Free Software
13 * Foundation, in version 2 as it comes in the "COPYING" file of the
14 * VirtualBox OSE distribution. VirtualBox OSE is distributed in the
15 * hope that it will be useful, but WITHOUT ANY WARRANTY of any kind.
16 */
17
18
19/** @page pg_gmm GMM - The Global Memory Manager
20 *
21 * As the name indicates, this component is responsible for global memory
22 * management. Currently only guest RAM is allocated from the GMM, but this
23 * may change to include shadow page tables and other bits later.
24 *
25 * Guest RAM is managed as individual pages, but allocated from the host OS
26 * in chunks for reasons of portability / efficiency. To minimize the memory
27 * footprint all tracking structures must be as small as possible without
28 * unnecessary performance penalties.
29 *
30 * The allocation chunks have a fixed size, defined at compile time
31 * by the #GMM_CHUNK_SIZE \#define.
32 *
33 * Each chunk is given a unique ID. Each page also has a unique ID. The
34 * relationship between the two IDs is:
35 * @code
36 * GMM_CHUNK_SHIFT = log2(GMM_CHUNK_SIZE / PAGE_SIZE);
37 * idPage = (idChunk << GMM_CHUNK_SHIFT) | iPage;
38 * @endcode
39 * Where iPage is the index of the page within the chunk. This ID scheme
40 * permits efficient chunk and page lookup, but it relies on the chunk size
41 * being set at compile time. The chunks are organized in an AVL tree with their
42 * IDs being the keys.
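 *
 * As a purely illustrative sketch (the code below uses its own helpers and
 * macros for this), the inverse mapping follows directly from the formula
 * above:
 * @code
 * idChunk = idPage >> GMM_CHUNK_SHIFT;
 * iPage   = idPage & ((GMM_CHUNK_SIZE / PAGE_SIZE) - 1);
 * @endcode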
43 *
44 * The physical address of each page in an allocation chunk is maintained by
45 * the #RTR0MEMOBJ and obtained using #RTR0MemObjGetPagePhysAddr. There is no
46 * need to duplicate this information (it would cost 8 bytes per page if we did).
47 *
48 * So what do we need to track per page? Most importantly we need to know
49 * which state the page is in:
50 * - Private - Allocated for (eventually) backing one particular VM page.
51 * - Shared - Readonly page that is used by one or more VMs and treated
52 * as COW by PGM.
53 * - Free - Not used by anyone.
54 *
55 * For the page replacement operations (sharing, defragmenting and freeing)
56 * to be somewhat efficient, private pages need to be associated with a
57 * particular page in a particular VM.
58 *
59 * Tracking the usage of shared pages is impractical and expensive, so we'll
60 * settle for a reference counting system instead.
61 *
62 * Free pages will be chained on LIFOs.
63 *
64 * On 64-bit systems we will use a 64-bit bitfield per page, while on 32-bit
65 * systems a 32-bit bitfield will have to suffice because of address space
66 * limitations. The #GMMPAGE structure shows the details.
67 *
68 *
69 * @section sec_gmm_alloc_strat Page Allocation Strategy
70 *
71 * The strategy for allocating pages has to take fragmentation and shared
72 * pages into account, or we may end up with 2000 chunks with only
73 * a few pages in each. Shared pages cannot easily be reallocated because
74 * of the inaccurate usage accounting (see above). Private pages can be
75 * reallocated by a defragmentation thread in the same manner that sharing
76 * is done.
77 *
78 * The first approach is to manage the free pages in two sets depending on
79 * whether they are mainly for the allocation of shared or private pages.
80 * In the initial implementation there will be almost no possibility for
81 * mixing shared and private pages in the same chunk (only if we're really
82 * stressed on memory), but when we implement forking of VMs and have to
83 * deal with lots of COW pages it'll start getting kind of interesting.
84 *
85 * The sets are lists of chunks with approximately the same number of
86 * free pages. Say the chunk size is 1MB, meaning 256 pages, and a set
87 * consists of 16 lists. So, the first list will contain the chunks with
88 * 1-7 free pages, the second covers 8-15, and so on. The chunks will be
89 * moved between the lists as pages are freed up or allocated.
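 *
 * A minimal sketch of one such binning step, where cPagesPerChunk and cLists
 * stand for the hypothetical 256 pages and 16 lists above (this is not the
 * actual free-set code, which has its own list layout and a dedicated list
 * for completely unused chunks):
 * @code
 * // Pick the list for a chunk based on its current free page count.
 * unsigned const cPagesPerList = cPagesPerChunk / cLists;
 * unsigned       idxList       = RT_MIN(pChunk->cFree / cPagesPerList, cLists - 1);
 * @endcode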
90 *
91 *
92 * @section sec_gmm_costs Costs
93 *
94 * The per page cost in kernel space is 32-bit plus whatever RTR0MEMOBJ
95 * entails. In addition there is the chunk cost of approximately
96 * (sizeof(RTR0MEMOBJ) + sizeof(CHUNK)) / 2^CHUNK_SHIFT bytes per page.
97 *
98 * On Windows the per page #RTR0MEMOBJ cost is 32-bit on 32-bit Windows
99 * and 64-bit on 64-bit Windows (a PFN_NUMBER in the MDL). So, 64-bit per page.
100 * The cost on Linux is identical, but there it's because of sizeof(struct page *).
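 *
 * As a rough worked example of the chunk term above (illustrative numbers
 * only): with 512 pages per chunk, i.e. 2^CHUNK_SHIFT = 512, and roughly
 * 4 KB of combined RTR0MEMOBJ and chunk bookkeeping, the chunk term comes
 * to about 4096 / 512 = 8 bytes per page.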
101 *
102 *
103 * @section sec_gmm_legacy Legacy Mode for Non-Tier-1 Platforms
104 *
105 * In legacy mode the page source is locked user pages and not
106 * #RTR0MemObjAllocPhysNC, which means that a page can only be allocated
107 * by the VM that locked it. We will make no attempt at implementing
108 * page sharing on these systems, just do enough to make it all work.
109 *
110 * @note With 6.1 really dropping 32-bit support, the legacy mode is obsoleted
111 * under the assumption that there is sufficient kernel virtual address
112 * space to map all of the guest memory allocations. So, we'll be using
113 * #RTR0MemObjAllocPage on some platforms as an alternative to
114 * #RTR0MemObjAllocPhysNC.
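 *
 * A hedged sketch of that probe-and-fallback choice (the chunk allocation
 * code later in this file is the authoritative version):
 * @code
 * RTR0MEMOBJ hMemObj;
 * int rc = RTR0MemObjAllocPhysNC(&hMemObj, GMM_CHUNK_SIZE, NIL_RTHCPHYS);
 * if (rc == VERR_NOT_SUPPORTED)
 *     rc = RTR0MemObjAllocPage(&hMemObj, GMM_CHUNK_SIZE, false); // fExecutable = false
 * @endcode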
115 *
116 *
117 * @subsection sub_gmm_locking Serializing
118 *
119 * One simple fast mutex will be employed in the initial implementation, not
120 * two as mentioned in @ref sec_pgmPhys_Serializing.
121 *
122 * @see @ref sec_pgmPhys_Serializing
123 *
124 *
125 * @section sec_gmm_overcommit Memory Over-Commitment Management
126 *
127 * The GVM will have to do the system wide memory over-commitment
128 * management. My current ideas are:
129 * - Per VM oc policy that indicates how much to initially commit
130 * to it and what to do in an out-of-memory situation.
131 * - Prevent overtaxing the host.
132 *
133 * There are some challenges here; the main ones are configurability and
134 * security. Should we for instance permit anyone to request 100% memory
135 * commitment? Who should be allowed to make runtime adjustments of the
136 * config? And how do we prevent these settings from being lost when the last
137 * VM process exits? The solution is probably to have an optional root
138 * daemon that will keep VMMR0.r0 in memory and enable the security measures.
139 *
140 *
141 *
142 * @section sec_gmm_numa NUMA
143 *
144 * NUMA considerations will be designed and implemented a bit later.
145 *
146 * The preliminary guess is that we will have to try to allocate memory as
147 * close as possible to the CPUs the VM is executed on (EMT and additional CPU
148 * threads), which means it's mostly about allocation and sharing policies.
149 * Both the scheduler and the allocator interface will have to supply some NUMA
150 * info, and we'll need a way to calculate access costs.
151 *
152 */
153
154
155/*********************************************************************************************************************************
156* Header Files *
157*********************************************************************************************************************************/
158#define LOG_GROUP LOG_GROUP_GMM
159#include <VBox/rawpci.h>
160#include <VBox/vmm/gmm.h>
161#include "GMMR0Internal.h"
162#include <VBox/vmm/vmcc.h>
163#include <VBox/vmm/pgm.h>
164#include <VBox/log.h>
165#include <VBox/param.h>
166#include <VBox/err.h>
167#include <VBox/VMMDev.h>
168#include <iprt/asm.h>
169#include <iprt/avl.h>
170#ifdef VBOX_STRICT
171# include <iprt/crc.h>
172#endif
173#include <iprt/critsect.h>
174#include <iprt/list.h>
175#include <iprt/mem.h>
176#include <iprt/memobj.h>
177#include <iprt/mp.h>
178#include <iprt/semaphore.h>
179#include <iprt/spinlock.h>
180#include <iprt/string.h>
181#include <iprt/time.h>
182
183
184/*********************************************************************************************************************************
185* Defined Constants And Macros *
186*********************************************************************************************************************************/
187/** @def VBOX_USE_CRIT_SECT_FOR_GIANT
188 * Use a critical section instead of a fast mutex for the giant GMM lock.
189 *
190 * @remarks This is primarily a way of avoiding the deadlock checks in the
191 * windows driver verifier. */
192#if defined(RT_OS_WINDOWS) || defined(RT_OS_DARWIN) || defined(DOXYGEN_RUNNING)
193# define VBOX_USE_CRIT_SECT_FOR_GIANT
194#endif
195
196#if defined(VBOX_WITH_LINEAR_HOST_PHYS_MEM) && !defined(RT_OS_DARWIN) && 0
197/** Enable the legacy mode code (will be dropped soon). */
198# define GMM_WITH_LEGACY_MODE
199#endif
200
201
202/*********************************************************************************************************************************
203* Structures and Typedefs *
204*********************************************************************************************************************************/
205/** Pointer to set of free chunks. */
206typedef struct GMMCHUNKFREESET *PGMMCHUNKFREESET;
207
208/**
209 * The per-page tracking structure employed by the GMM.
210 *
211 * On 32-bit hosts some trickery is necessary to compress all
212 * the information into 32 bits. When the fSharedFree member is set,
213 * the 30th bit decides whether it's a free page or not.
214 *
215 * Because of the different layout on 32-bit and 64-bit hosts, macros
216 * are used to get and set some of the data.
217 */
218typedef union GMMPAGE
219{
220#if HC_ARCH_BITS == 64
221 /** Unsigned integer view. */
222 uint64_t u;
223
224 /** The common view. */
225 struct GMMPAGECOMMON
226 {
227 uint32_t uStuff1 : 32;
228 uint32_t uStuff2 : 30;
229 /** The page state. */
230 uint32_t u2State : 2;
231 } Common;
232
233 /** The view of a private page. */
234 struct GMMPAGEPRIVATE
235 {
236 /** The guest page frame number. (Max addressable: 2 ^ 44 - 16) */
237 uint32_t pfn;
238 /** The GVM handle. (64K VMs) */
239 uint32_t hGVM : 16;
240 /** Reserved. */
241 uint32_t u16Reserved : 14;
242 /** The page state. */
243 uint32_t u2State : 2;
244 } Private;
245
246 /** The view of a shared page. */
247 struct GMMPAGESHARED
248 {
249 /** The host page frame number. (Max addressable: 2 ^ 44 - 16) */
250 uint32_t pfn;
251 /** The reference count (64K VMs). */
252 uint32_t cRefs : 16;
253 /** Used for debug checksumming. */
254 uint32_t u14Checksum : 14;
255 /** The page state. */
256 uint32_t u2State : 2;
257 } Shared;
258
259 /** The view of a free page. */
260 struct GMMPAGEFREE
261 {
262 /** The index of the next page in the free list. UINT16_MAX is NIL. */
263 uint16_t iNext;
264 /** Reserved. Checksum or something? */
265 uint16_t u16Reserved0;
266 /** Reserved. Checksum or something? */
267 uint32_t u30Reserved1 : 30;
268 /** The page state. */
269 uint32_t u2State : 2;
270 } Free;
271
272#else /* 32-bit */
273 /** Unsigned integer view. */
274 uint32_t u;
275
276 /** The common view. */
277 struct GMMPAGECOMMON
278 {
279 uint32_t uStuff : 30;
280 /** The page state. */
281 uint32_t u2State : 2;
282 } Common;
283
284 /** The view of a private page. */
285 struct GMMPAGEPRIVATE
286 {
287 /** The guest page frame number. (Max addressable: 2 ^ 36) */
288 uint32_t pfn : 24;
289 /** The GVM handle. (127 VMs) */
290 uint32_t hGVM : 7;
291 /** The top page state bit, MBZ. */
292 uint32_t fZero : 1;
293 } Private;
294
295 /** The view of a shared page. */
296 struct GMMPAGESHARED
297 {
298 /** The reference count. */
299 uint32_t cRefs : 30;
300 /** The page state. */
301 uint32_t u2State : 2;
302 } Shared;
303
304 /** The view of a free page. */
305 struct GMMPAGEFREE
306 {
307 /** The index of the next page in the free list. UINT16_MAX is NIL. */
308 uint32_t iNext : 16;
309 /** Reserved. Checksum or something? */
310 uint32_t u14Reserved : 14;
311 /** The page state. */
312 uint32_t u2State : 2;
313 } Free;
314#endif
315} GMMPAGE;
316AssertCompileSize(GMMPAGE, sizeof(RTHCUINTPTR));
317/** Pointer to a GMMPAGE. */
318typedef GMMPAGE *PGMMPAGE;
319
320
321/** @name The Page States.
322 * @{ */
323/** A private page. */
324#define GMM_PAGE_STATE_PRIVATE 0
325/** A private page - alternative value used on the 32-bit implementation.
326 * This will never be used on 64-bit hosts. */
327#define GMM_PAGE_STATE_PRIVATE_32 1
328/** A shared page. */
329#define GMM_PAGE_STATE_SHARED 2
330/** A free page. */
331#define GMM_PAGE_STATE_FREE 3
332/** @} */
333
334
335/** @def GMM_PAGE_IS_PRIVATE
336 *
337 * @returns true if private, false if not.
338 * @param pPage The GMM page.
339 */
340#if HC_ARCH_BITS == 64
341# define GMM_PAGE_IS_PRIVATE(pPage) ( (pPage)->Common.u2State == GMM_PAGE_STATE_PRIVATE )
342#else
343# define GMM_PAGE_IS_PRIVATE(pPage) ( (pPage)->Private.fZero == 0 )
344#endif
345
346/** @def GMM_PAGE_IS_SHARED
347 *
348 * @returns true if shared, false if not.
349 * @param pPage The GMM page.
350 */
351#define GMM_PAGE_IS_SHARED(pPage) ( (pPage)->Common.u2State == GMM_PAGE_STATE_SHARED )
352
353/** @def GMM_PAGE_IS_FREE
354 *
355 * @returns true if free, false if not.
356 * @param pPage The GMM page.
357 */
358#define GMM_PAGE_IS_FREE(pPage) ( (pPage)->Common.u2State == GMM_PAGE_STATE_FREE )
359
360/** @def GMM_PAGE_PFN_LAST
361 * The last valid guest pfn range.
362 * @remark Some of the values outside the range have special meaning,
363 * see GMM_PAGE_PFN_UNSHAREABLE.
364 */
365#if HC_ARCH_BITS == 64
366# define GMM_PAGE_PFN_LAST UINT32_C(0xfffffff0)
367#else
368# define GMM_PAGE_PFN_LAST UINT32_C(0x00fffff0)
369#endif
370AssertCompile(GMM_PAGE_PFN_LAST == (GMM_GCPHYS_LAST >> PAGE_SHIFT));
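/* Worked example for the 64-bit case above: GMM_PAGE_PFN_LAST is 0xfffffff0,
   i.e. roughly 2^32 page frames, which with 4 KB (2^12 byte) pages covers
   about 2^44 bytes of guest physical address space - matching the
   "Max addressable: 2 ^ 44" remarks in the GMMPAGE field descriptions. */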
371
372/** @def GMM_PAGE_PFN_UNSHAREABLE
373 * Indicates that this page isn't used for normal guest memory and thus isn't shareable.
374 */
375#if HC_ARCH_BITS == 64
376# define GMM_PAGE_PFN_UNSHAREABLE UINT32_C(0xfffffff1)
377#else
378# define GMM_PAGE_PFN_UNSHAREABLE UINT32_C(0x00fffff1)
379#endif
380AssertCompile(GMM_PAGE_PFN_UNSHAREABLE == (GMM_GCPHYS_UNSHAREABLE >> PAGE_SHIFT));
381
382
383/**
384 * A GMM allocation chunk ring-3 mapping record.
385 *
386 * This should really be associated with a session and not a VM, but
387 * it's simpler to associate it with a VM and clean up when the VM object
388 * is destroyed.
389 */
390typedef struct GMMCHUNKMAP
391{
392 /** The mapping object. */
393 RTR0MEMOBJ hMapObj;
394 /** The VM owning the mapping. */
395 PGVM pGVM;
396} GMMCHUNKMAP;
397/** Pointer to a GMM allocation chunk mapping. */
398typedef struct GMMCHUNKMAP *PGMMCHUNKMAP;
399
400
401/**
402 * A GMM allocation chunk.
403 */
404typedef struct GMMCHUNK
405{
406 /** The AVL node core.
407 * The Key is the chunk ID. (Giant mtx.) */
408 AVLU32NODECORE Core;
409 /** The memory object.
410 * Either from RTR0MemObjAllocPhysNC or RTR0MemObjLockUser depending on
411 * what the host can dish up. (Chunk mtx protects mapping accesses
412 * and related frees.) */
413 RTR0MEMOBJ hMemObj;
414#ifndef VBOX_WITH_LINEAR_HOST_PHYS_MEM
415 /** Pointer to the kernel mapping. */
416 uint8_t *pbMapping;
417#endif
418 /** Pointer to the next chunk in the free list. (Giant mtx.) */
419 PGMMCHUNK pFreeNext;
420 /** Pointer to the previous chunk in the free list. (Giant mtx.) */
421 PGMMCHUNK pFreePrev;
422 /** Pointer to the free set this chunk belongs to. NULL for
423 * chunks with no free pages. (Giant mtx.) */
424 PGMMCHUNKFREESET pSet;
425 /** List node in the chunk list (GMM::ChunkList). (Giant mtx.) */
426 RTLISTNODE ListNode;
427 /** Pointer to an array of mappings. (Chunk mtx.) */
428 PGMMCHUNKMAP paMappingsX;
429 /** The number of mappings. (Chunk mtx.) */
430 uint16_t cMappingsX;
431 /** The mapping lock this chunk is using. UINT8_MAX if nobody is
432 * mapping or freeing anything. (Giant mtx.) */
433 uint8_t volatile iChunkMtx;
434 /** GMM_CHUNK_FLAGS_XXX. (Giant mtx.) */
435 uint8_t fFlags;
436 /** The head of the list of free pages. UINT16_MAX is the NIL value.
437 * (Giant mtx.) */
438 uint16_t iFreeHead;
439 /** The number of free pages. (Giant mtx.) */
440 uint16_t cFree;
441 /** The GVM handle of the VM that first allocated pages from this chunk, this
442 * is used as a preference when there are several chunks to choose from.
443 * When in bound memory mode this isn't a preference any longer. (Giant
444 * mtx.) */
445 uint16_t hGVM;
446 /** The ID of the NUMA node the memory mostly resides on. (Reserved for
447 * future use.) (Giant mtx.) */
448 uint16_t idNumaNode;
449 /** The number of private pages. (Giant mtx.) */
450 uint16_t cPrivate;
451 /** The number of shared pages. (Giant mtx.) */
452 uint16_t cShared;
453 /** The pages. (Giant mtx.) */
454 GMMPAGE aPages[GMM_CHUNK_SIZE >> PAGE_SHIFT];
455} GMMCHUNK;
456
457/** Indicates that the NUMA properties of the memory are unknown. */
458#define GMM_CHUNK_NUMA_ID_UNKNOWN UINT16_C(0xfffe)
459
460/** @name GMM_CHUNK_FLAGS_XXX - chunk flags.
461 * @{ */
462/** Indicates that the chunk is a large page (2MB). */
463#define GMM_CHUNK_FLAGS_LARGE_PAGE UINT16_C(0x0001)
464#ifdef GMM_WITH_LEGACY_MODE
465/** Indicates that the chunk was locked rather than allocated directly. */
466# define GMM_CHUNK_FLAGS_SEEDED UINT16_C(0x0002)
467#endif
468/** @} */
469
470
471/**
472 * An allocation chunk TLB entry.
473 */
474typedef struct GMMCHUNKTLBE
475{
476 /** The chunk id. */
477 uint32_t idChunk;
478 /** Pointer to the chunk. */
479 PGMMCHUNK pChunk;
480} GMMCHUNKTLBE;
481/** Pointer to an allocation chunk TLB entry. */
482typedef GMMCHUNKTLBE *PGMMCHUNKTLBE;
483
484
485/** The number of entries in the allocation chunk TLB. */
486#define GMM_CHUNKTLB_ENTRIES 32
487/** Gets the TLB entry index for the given Chunk ID. */
488#define GMM_CHUNKTLB_IDX(idChunk) ( (idChunk) & (GMM_CHUNKTLB_ENTRIES - 1) )
489
490/**
491 * An allocation chunk TLB.
492 */
493typedef struct GMMCHUNKTLB
494{
495 /** The TLB entries. */
496 GMMCHUNKTLBE aEntries[GMM_CHUNKTLB_ENTRIES];
497} GMMCHUNKTLB;
498/** Pointer to an allocation chunk TLB. */
499typedef GMMCHUNKTLB *PGMMCHUNKTLB;
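/* A hedged sketch of how a chunk TLB entry is meant to be consulted (the real
   lookup code later in this file also handles misses and re-validation against
   the chunk freeing generation):
       PGMMCHUNKTLBE pTlbe  = &pGMM->ChunkTLB.aEntries[GMM_CHUNKTLB_IDX(idChunk)];
       PGMMCHUNK     pChunk = pTlbe->idChunk == idChunk ? pTlbe->pChunk : NULL;
 */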
500
501
502/**
503 * The GMM instance data.
504 */
505typedef struct GMM
506{
507 /** Magic / eye catcher. GMM_MAGIC */
508 uint32_t u32Magic;
509 /** The number of threads waiting on the mutex. */
510 uint32_t cMtxContenders;
511#ifdef VBOX_USE_CRIT_SECT_FOR_GIANT
512 /** The critical section protecting the GMM.
513 * More fine grained locking can be implemented later if necessary. */
514 RTCRITSECT GiantCritSect;
515#else
516 /** The fast mutex protecting the GMM.
517 * More fine grained locking can be implemented later if necessary. */
518 RTSEMFASTMUTEX hMtx;
519#endif
520#ifdef VBOX_STRICT
521 /** The current mutex owner. */
522 RTNATIVETHREAD hMtxOwner;
523#endif
524 /** Spinlock protecting the AVL tree.
525 * @todo Make this a read-write spinlock as we should allow concurrent
526 * lookups. */
527 RTSPINLOCK hSpinLockTree;
528 /** The chunk tree.
529 * Protected by hSpinLockTree. */
530 PAVLU32NODECORE pChunks;
531 /** Chunk freeing generation - incremented whenever a chunk is freed. Used
532 * for validating the per-VM chunk TLB entries. Valid range is 1 to 2^62
533 * (exclusive), though higher numbers may temporarily occur while
534 * invalidating the individual TLBs during wrap-around processing. */
535 uint64_t volatile idFreeGeneration;
536 /** The chunk TLB.
537 * Protected by hSpinLockTree. */
538 GMMCHUNKTLB ChunkTLB;
539 /** The private free set. */
540 GMMCHUNKFREESET PrivateX;
541 /** The shared free set. */
542 GMMCHUNKFREESET Shared;
543
544 /** Shared module tree (global).
545 * @todo separate trees for distinctly different guest OSes. */
546 PAVLLU32NODECORE pGlobalSharedModuleTree;
547 /** Sharable modules (count of nodes in pGlobalSharedModuleTree). */
548 uint32_t cShareableModules;
549
550 /** The chunk list. For simplifying the cleanup process and avoiding tree
551 * traversal. */
552 RTLISTANCHOR ChunkList;
553
554 /** The maximum number of pages we're allowed to allocate.
555 * @gcfgm{GMM/MaxPages,64-bit, Direct.}
556 * @gcfgm{GMM/PctPages,32-bit, Relative to the number of host pages.} */
557 uint64_t cMaxPages;
558 /** The number of pages that have been reserved.
559 * The deal is that cReservedPages - cOverCommittedPages <= cMaxPages. */
560 uint64_t cReservedPages;
561 /** The number of pages that we have over-committed in reservations. */
562 uint64_t cOverCommittedPages;
563 /** The number of actually allocated (committed if you like) pages. */
564 uint64_t cAllocatedPages;
565 /** The number of pages that are shared. A subset of cAllocatedPages. */
566 uint64_t cSharedPages;
567 /** The number of pages that are actually shared between VMs. */
568 uint64_t cDuplicatePages;
569 /** The number of pages that are shared that have been left behind by
570 * VMs not doing proper cleanups. */
571 uint64_t cLeftBehindSharedPages;
572 /** The number of allocation chunks.
573 * (The number of pages we've allocated from the host can be derived from this.) */
574 uint32_t cChunks;
575 /** The number of current ballooned pages. */
576 uint64_t cBalloonedPages;
577
578#ifndef GMM_WITH_LEGACY_MODE
579# ifdef VBOX_WITH_LINEAR_HOST_PHYS_MEM
580 /** Whether #RTR0MemObjAllocPhysNC works. */
581 bool fHasWorkingAllocPhysNC;
582# else
583 bool fPadding;
584# endif
585#else
586 /** The legacy allocation mode indicator.
587 * This is determined at initialization time. */
588 bool fLegacyAllocationMode;
589#endif
590 /** The bound memory mode indicator.
591 * When set, the memory will be bound to a specific VM and never
592 * shared. This is always set if fLegacyAllocationMode is set.
593 * (Also determined at initialization time.) */
594 bool fBoundMemoryMode;
595 /** The number of registered VMs. */
596 uint16_t cRegisteredVMs;
597
598 /** The number of freed chunks ever. This is used as a list generation to
599 * avoid restarting the cleanup scanning when the list wasn't modified. */
600 uint32_t cFreedChunks;
601 /** The previously allocated Chunk ID.
602 * Used as a hint to avoid scanning the whole bitmap. */
603 uint32_t idChunkPrev;
604 /** Chunk ID allocation bitmap.
605 * Bits of allocated IDs are set, free ones are clear.
606 * The NIL id (0) is marked allocated. */
607 uint32_t bmChunkId[(GMM_CHUNKID_LAST + 1 + 31) / 32];
608
609 /** The index of the next mutex to use. */
610 uint32_t iNextChunkMtx;
611 /** Chunk locks for reducing lock contention without having to allocate
612 * one lock per chunk. */
613 struct
614 {
615 /** The mutex */
616 RTSEMFASTMUTEX hMtx;
617 /** The number of threads currently using this mutex. */
618 uint32_t volatile cUsers;
619 } aChunkMtx[64];
620} GMM;
621/** Pointer to the GMM instance. */
622typedef GMM *PGMM;
623
624/** The value of GMM::u32Magic (Katsuhiro Otomo). */
625#define GMM_MAGIC UINT32_C(0x19540414)
626
627
628/**
629 * GMM chunk mutex state.
630 *
631 * This is returned by gmmR0ChunkMutexAcquire and is used by the other
632 * gmmR0ChunkMutex* methods.
633 */
634typedef struct GMMR0CHUNKMTXSTATE
635{
636 PGMM pGMM;
637 /** The index of the chunk mutex. */
638 uint8_t iChunkMtx;
639 /** The relevant flags (GMMR0CHUNK_MTX_XXX). */
640 uint8_t fFlags;
641} GMMR0CHUNKMTXSTATE;
642/** Pointer to a chunk mutex state. */
643typedef GMMR0CHUNKMTXSTATE *PGMMR0CHUNKMTXSTATE;
644
645/** @name GMMR0CHUNK_MTX_XXX
646 * @{ */
647#define GMMR0CHUNK_MTX_INVALID UINT32_C(0)
648#define GMMR0CHUNK_MTX_KEEP_GIANT UINT32_C(1)
649#define GMMR0CHUNK_MTX_RETAKE_GIANT UINT32_C(2)
650#define GMMR0CHUNK_MTX_DROP_GIANT UINT32_C(3)
651#define GMMR0CHUNK_MTX_END UINT32_C(4)
652/** @} */
653
654
655/** The maximum number of shared modules per-vm. */
656#define GMM_MAX_SHARED_PER_VM_MODULES 2048
657/** The maximum number of shared modules GMM is allowed to track. */
658#define GMM_MAX_SHARED_GLOBAL_MODULES 16834
659
660
661/**
662 * Argument packet for gmmR0SharedModuleCleanup.
663 */
664typedef struct GMMR0SHMODPERVMDTORARGS
665{
666 PGVM pGVM;
667 PGMM pGMM;
668} GMMR0SHMODPERVMDTORARGS;
669
670/**
671 * Argument packet for gmmR0CheckSharedModule.
672 */
673typedef struct GMMCHECKSHAREDMODULEINFO
674{
675 PGVM pGVM;
676 VMCPUID idCpu;
677} GMMCHECKSHAREDMODULEINFO;
678
679
680/*********************************************************************************************************************************
681* Global Variables *
682*********************************************************************************************************************************/
683/** Pointer to the GMM instance data. */
684static PGMM g_pGMM = NULL;
685
686/** Macro for obtaining and validating the g_pGMM pointer.
687 *
688 * On failure it will return from the invoking function with the specified
689 * return value.
690 *
691 * @param pGMM The name of the pGMM variable.
692 * @param rc The return value on failure. Use VERR_GMM_INSTANCE for VBox
693 * status codes.
694 */
695#define GMM_GET_VALID_INSTANCE(pGMM, rc) \
696 do { \
697 (pGMM) = g_pGMM; \
698 AssertPtrReturn((pGMM), (rc)); \
699 AssertMsgReturn((pGMM)->u32Magic == GMM_MAGIC, ("%p - %#x\n", (pGMM), (pGMM)->u32Magic), (rc)); \
700 } while (0)
701
702/** Macro for obtaining and validating the g_pGMM pointer, void function
703 * variant.
704 *
705 * On failure it will return from the invoking function.
706 *
707 * @param pGMM The name of the pGMM variable.
708 */
709#define GMM_GET_VALID_INSTANCE_VOID(pGMM) \
710 do { \
711 (pGMM) = g_pGMM; \
712 AssertPtrReturnVoid((pGMM)); \
713 AssertMsgReturnVoid((pGMM)->u32Magic == GMM_MAGIC, ("%p - %#x\n", (pGMM), (pGMM)->u32Magic)); \
714 } while (0)
715
716
717/** @def GMM_CHECK_SANITY_UPON_ENTERING
718 * Checks the sanity of the GMM instance data before making changes.
719 *
720 * This macro is a stub by default and must be enabled manually in the code.
721 *
722 * @returns true if sane, false if not.
723 * @param pGMM The name of the pGMM variable.
724 */
725#if defined(VBOX_STRICT) && defined(GMMR0_WITH_SANITY_CHECK) && 0
726# define GMM_CHECK_SANITY_UPON_ENTERING(pGMM) (gmmR0SanityCheck((pGMM), __PRETTY_FUNCTION__, __LINE__) == 0)
727#else
728# define GMM_CHECK_SANITY_UPON_ENTERING(pGMM) (true)
729#endif
730
731/** @def GMM_CHECK_SANITY_UPON_LEAVING
732 * Checks the sanity of the GMM instance data after making changes.
733 *
734 * This macro is a stub by default and must be enabled manually in the code.
735 *
736 * @returns true if sane, false if not.
737 * @param pGMM The name of the pGMM variable.
738 */
739#if defined(VBOX_STRICT) && defined(GMMR0_WITH_SANITY_CHECK) && 0
740# define GMM_CHECK_SANITY_UPON_LEAVING(pGMM) (gmmR0SanityCheck((pGMM), __PRETTY_FUNCTION__, __LINE__) == 0)
741#else
742# define GMM_CHECK_SANITY_UPON_LEAVING(pGMM) (true)
743#endif
744
745/** @def GMM_CHECK_SANITY_IN_LOOPS
746 * Checks the sanity of the GMM instance in the allocation loops.
747 *
748 * This macro is a stub by default and must be enabled manually in the code.
749 *
750 * @returns true if sane, false if not.
751 * @param pGMM The name of the pGMM variable.
752 */
753#if defined(VBOX_STRICT) && defined(GMMR0_WITH_SANITY_CHECK) && 0
754# define GMM_CHECK_SANITY_IN_LOOPS(pGMM) (gmmR0SanityCheck((pGMM), __PRETTY_FUNCTION__, __LINE__) == 0)
755#else
756# define GMM_CHECK_SANITY_IN_LOOPS(pGMM) (true)
757#endif
758
759
760/*********************************************************************************************************************************
761* Internal Functions *
762*********************************************************************************************************************************/
763static DECLCALLBACK(int) gmmR0TermDestroyChunk(PAVLU32NODECORE pNode, void *pvGMM);
764static bool gmmR0CleanupVMScanChunk(PGMM pGMM, PGVM pGVM, PGMMCHUNK pChunk);
765DECLINLINE(void) gmmR0UnlinkChunk(PGMMCHUNK pChunk);
766DECLINLINE(void) gmmR0LinkChunk(PGMMCHUNK pChunk, PGMMCHUNKFREESET pSet);
767DECLINLINE(void) gmmR0SelectSetAndLinkChunk(PGMM pGMM, PGVM pGVM, PGMMCHUNK pChunk);
768#ifdef GMMR0_WITH_SANITY_CHECK
769static uint32_t gmmR0SanityCheck(PGMM pGMM, const char *pszFunction, unsigned uLineNo);
770#endif
771static bool gmmR0FreeChunk(PGMM pGMM, PGVM pGVM, PGMMCHUNK pChunk, bool fRelaxedSem);
772DECLINLINE(void) gmmR0FreePrivatePage(PGMM pGMM, PGVM pGVM, uint32_t idPage, PGMMPAGE pPage);
773DECLINLINE(void) gmmR0FreeSharedPage(PGMM pGMM, PGVM pGVM, uint32_t idPage, PGMMPAGE pPage);
774static int gmmR0UnmapChunkLocked(PGMM pGMM, PGVM pGVM, PGMMCHUNK pChunk);
775#ifdef VBOX_WITH_PAGE_SHARING
776static void gmmR0SharedModuleCleanup(PGMM pGMM, PGVM pGVM);
777# ifdef VBOX_STRICT
778static uint32_t gmmR0StrictPageChecksum(PGMM pGMM, PGVM pGVM, uint32_t idPage);
779# endif
780#endif
781
782
783
784/**
785 * Initializes the GMM component.
786 *
787 * This is called when the VMMR0.r0 module is loaded and protected by the
788 * loader semaphore.
789 *
790 * @returns VBox status code.
791 */
792GMMR0DECL(int) GMMR0Init(void)
793{
794 LogFlow(("GMMInit:\n"));
795
796 /*
797 * Allocate the instance data and the locks.
798 */
799 PGMM pGMM = (PGMM)RTMemAllocZ(sizeof(*pGMM));
800 if (!pGMM)
801 return VERR_NO_MEMORY;
802
803 pGMM->u32Magic = GMM_MAGIC;
804 for (unsigned i = 0; i < RT_ELEMENTS(pGMM->ChunkTLB.aEntries); i++)
805 pGMM->ChunkTLB.aEntries[i].idChunk = NIL_GMM_CHUNKID;
806 RTListInit(&pGMM->ChunkList);
807 ASMBitSet(&pGMM->bmChunkId[0], NIL_GMM_CHUNKID);
808
809#ifdef VBOX_USE_CRIT_SECT_FOR_GIANT
810 int rc = RTCritSectInit(&pGMM->GiantCritSect);
811#else
812 int rc = RTSemFastMutexCreate(&pGMM->hMtx);
813#endif
814 if (RT_SUCCESS(rc))
815 {
816 unsigned iMtx;
817 for (iMtx = 0; iMtx < RT_ELEMENTS(pGMM->aChunkMtx); iMtx++)
818 {
819 rc = RTSemFastMutexCreate(&pGMM->aChunkMtx[iMtx].hMtx);
820 if (RT_FAILURE(rc))
821 break;
822 }
823 pGMM->hSpinLockTree = NIL_RTSPINLOCK;
824 if (RT_SUCCESS(rc))
825 rc = RTSpinlockCreate(&pGMM->hSpinLockTree, RTSPINLOCK_FLAGS_INTERRUPT_SAFE, "gmm-chunk-tree");
826 if (RT_SUCCESS(rc))
827 {
828#ifndef GMM_WITH_LEGACY_MODE
829 /*
830 * Figure out how we're going to allocate stuff (only applicable to
831 * host with linear physical memory mappings).
832 */
833 pGMM->fBoundMemoryMode = false;
834# ifdef VBOX_WITH_LINEAR_HOST_PHYS_MEM
835 pGMM->fHasWorkingAllocPhysNC = false;
836
837 RTR0MEMOBJ hMemObj;
838 rc = RTR0MemObjAllocPhysNC(&hMemObj, GMM_CHUNK_SIZE, NIL_RTHCPHYS);
839 if (RT_SUCCESS(rc))
840 {
841 rc = RTR0MemObjFree(hMemObj, true);
842 AssertRC(rc);
843 pGMM->fHasWorkingAllocPhysNC = true;
844 }
845 else if (rc != VERR_NOT_SUPPORTED)
846 SUPR0Printf("GMMR0Init: Warning! RTR0MemObjAllocPhysNC(, %u, NIL_RTHCPHYS) -> %d!\n", GMM_CHUNK_SIZE, rc);
847# endif
848#else /* GMM_WITH_LEGACY_MODE */
849 /*
850 * Check and see if RTR0MemObjAllocPhysNC works.
851 */
852# if 0 /* later, see @bugref{3170}. */
853 RTR0MEMOBJ MemObj;
854 rc = RTR0MemObjAllocPhysNC(&MemObj, _64K, NIL_RTHCPHYS);
855 if (RT_SUCCESS(rc))
856 {
857 rc = RTR0MemObjFree(MemObj, true);
858 AssertRC(rc);
859 }
860 else if (rc == VERR_NOT_SUPPORTED)
861 pGMM->fLegacyAllocationMode = pGMM->fBoundMemoryMode = true;
862 else
863 SUPR0Printf("GMMR0Init: RTR0MemObjAllocPhysNC(,64K,Any) -> %d!\n", rc);
864# else
865# if defined(RT_OS_WINDOWS) || (defined(RT_OS_SOLARIS) && ARCH_BITS == 64) || defined(RT_OS_LINUX) || defined(RT_OS_FREEBSD)
866 pGMM->fLegacyAllocationMode = false;
867# if ARCH_BITS == 32
868 /* Don't reuse possibly partial chunks because of the virtual
869 address space limitation. */
870 pGMM->fBoundMemoryMode = true;
871# else
872 pGMM->fBoundMemoryMode = false;
873# endif
874# else
875 pGMM->fLegacyAllocationMode = true;
876 pGMM->fBoundMemoryMode = true;
877# endif
878# endif
879#endif /* GMM_WITH_LEGACY_MODE */
880
881 /*
882 * Query system page count and guess a reasonable cMaxPages value.
883 */
884 pGMM->cMaxPages = UINT32_MAX; /** @todo IPRT function for query ram size and such. */
885
886 /*
887 * The idFreeGeneration value should be set so we actually trigger the
888 * wrap-around invalidation handling during a typical test run.
889 */
890 pGMM->idFreeGeneration = UINT64_MAX / 4 - 128;
891
892 g_pGMM = pGMM;
893#ifdef GMM_WITH_LEGACY_MODE
894 LogFlow(("GMMInit: pGMM=%p fLegacyAllocationMode=%RTbool fBoundMemoryMode=%RTbool\n", pGMM, pGMM->fLegacyAllocationMode, pGMM->fBoundMemoryMode));
895#elif defined(VBOX_WITH_LINEAR_HOST_PHYS_MEM)
896 LogFlow(("GMMInit: pGMM=%p fBoundMemoryMode=%RTbool fHasWorkingAllocPhysNC=%RTbool\n", pGMM, pGMM->fBoundMemoryMode, pGMM->fHasWorkingAllocPhysNC));
897#else
898 LogFlow(("GMMInit: pGMM=%p fBoundMemoryMode=%RTbool\n", pGMM, pGMM->fBoundMemoryMode));
899#endif
900 return VINF_SUCCESS;
901 }
902
903 /*
904 * Bail out.
905 */
906 RTSpinlockDestroy(pGMM->hSpinLockTree);
907 while (iMtx-- > 0)
908 RTSemFastMutexDestroy(pGMM->aChunkMtx[iMtx].hMtx);
909#ifdef VBOX_USE_CRIT_SECT_FOR_GIANT
910 RTCritSectDelete(&pGMM->GiantCritSect);
911#else
912 RTSemFastMutexDestroy(pGMM->hMtx);
913#endif
914 }
915
916 pGMM->u32Magic = 0;
917 RTMemFree(pGMM);
918 SUPR0Printf("GMMR0Init: failed! rc=%d\n", rc);
919 return rc;
920}
921
922
923/**
924 * Terminates the GMM component.
925 */
926GMMR0DECL(void) GMMR0Term(void)
927{
928 LogFlow(("GMMTerm:\n"));
929
930 /*
931 * Take care / be paranoid...
932 */
933 PGMM pGMM = g_pGMM;
934 if (!RT_VALID_PTR(pGMM))
935 return;
936 if (pGMM->u32Magic != GMM_MAGIC)
937 {
938 SUPR0Printf("GMMR0Term: u32Magic=%#x\n", pGMM->u32Magic);
939 return;
940 }
941
942 /*
943 * Undo what init did and free all the resources we've acquired.
944 */
945 /* Destroy the fundamentals. */
946 g_pGMM = NULL;
947 pGMM->u32Magic = ~GMM_MAGIC;
948#ifdef VBOX_USE_CRIT_SECT_FOR_GIANT
949 RTCritSectDelete(&pGMM->GiantCritSect);
950#else
951 RTSemFastMutexDestroy(pGMM->hMtx);
952 pGMM->hMtx = NIL_RTSEMFASTMUTEX;
953#endif
954 RTSpinlockDestroy(pGMM->hSpinLockTree);
955 pGMM->hSpinLockTree = NIL_RTSPINLOCK;
956
957 /* Free any chunks still hanging around. */
958 RTAvlU32Destroy(&pGMM->pChunks, gmmR0TermDestroyChunk, pGMM);
959
960 /* Destroy the chunk locks. */
961 for (unsigned iMtx = 0; iMtx < RT_ELEMENTS(pGMM->aChunkMtx); iMtx++)
962 {
963 Assert(pGMM->aChunkMtx[iMtx].cUsers == 0);
964 RTSemFastMutexDestroy(pGMM->aChunkMtx[iMtx].hMtx);
965 pGMM->aChunkMtx[iMtx].hMtx = NIL_RTSEMFASTMUTEX;
966 }
967
968 /* Finally the instance data itself. */
969 RTMemFree(pGMM);
970 LogFlow(("GMMTerm: done\n"));
971}
972
973
974/**
975 * RTAvlU32Destroy callback.
976 *
977 * @returns 0
978 * @param pNode The node to destroy.
979 * @param pvGMM The GMM handle.
980 */
981static DECLCALLBACK(int) gmmR0TermDestroyChunk(PAVLU32NODECORE pNode, void *pvGMM)
982{
983 PGMMCHUNK pChunk = (PGMMCHUNK)pNode;
984
985 if (pChunk->cFree != (GMM_CHUNK_SIZE >> PAGE_SHIFT))
986 SUPR0Printf("GMMR0Term: %RKv/%#x: cFree=%d cPrivate=%d cShared=%d cMappings=%d\n", pChunk,
987 pChunk->Core.Key, pChunk->cFree, pChunk->cPrivate, pChunk->cShared, pChunk->cMappingsX);
988
989 int rc = RTR0MemObjFree(pChunk->hMemObj, true /* fFreeMappings */);
990 if (RT_FAILURE(rc))
991 {
992 SUPR0Printf("GMMR0Term: %RKv/%#x: RTRMemObjFree(%RKv,true) -> %d (cMappings=%d)\n", pChunk,
993 pChunk->Core.Key, pChunk->hMemObj, rc, pChunk->cMappingsX);
994 AssertRC(rc);
995 }
996 pChunk->hMemObj = NIL_RTR0MEMOBJ;
997
998 RTMemFree(pChunk->paMappingsX);
999 pChunk->paMappingsX = NULL;
1000
1001 RTMemFree(pChunk);
1002 NOREF(pvGMM);
1003 return 0;
1004}
1005
1006
1007/**
1008 * Initializes the per-VM data for the GMM.
1009 *
1010 * This is called from within the GVMM lock (from GVMMR0CreateVM)
1011 * and should only initialize the data members so GMMR0CleanupVM
1012 * can deal with them. We reserve no memory or anything here,
1013 * that's done later in GMMR0InitVM.
1014 *
1015 * @param pGVM Pointer to the Global VM structure.
1016 */
1017GMMR0DECL(int) GMMR0InitPerVMData(PGVM pGVM)
1018{
1019 AssertCompile(RT_SIZEOFMEMB(GVM,gmm.s) <= RT_SIZEOFMEMB(GVM,gmm.padding));
1020
1021 pGVM->gmm.s.Stats.enmPolicy = GMMOCPOLICY_INVALID;
1022 pGVM->gmm.s.Stats.enmPriority = GMMPRIORITY_INVALID;
1023 pGVM->gmm.s.Stats.fMayAllocate = false;
1024
1025 pGVM->gmm.s.hChunkTlbSpinLock = NIL_RTSPINLOCK;
1026 int rc = RTSpinlockCreate(&pGVM->gmm.s.hChunkTlbSpinLock, RTSPINLOCK_FLAGS_INTERRUPT_SAFE, "per-vm-chunk-tlb");
1027 AssertRCReturn(rc, rc);
1028
1029 return VINF_SUCCESS;
1030}
1031
1032
1033/**
1034 * Acquires the GMM giant lock.
1035 *
1036 * @returns Assert status code from RTSemFastMutexRequest.
1037 * @param pGMM Pointer to the GMM instance.
1038 */
1039static int gmmR0MutexAcquire(PGMM pGMM)
1040{
1041 ASMAtomicIncU32(&pGMM->cMtxContenders);
1042#ifdef VBOX_USE_CRIT_SECT_FOR_GIANT
1043 int rc = RTCritSectEnter(&pGMM->GiantCritSect);
1044#else
1045 int rc = RTSemFastMutexRequest(pGMM->hMtx);
1046#endif
1047 ASMAtomicDecU32(&pGMM->cMtxContenders);
1048 AssertRC(rc);
1049#ifdef VBOX_STRICT
1050 pGMM->hMtxOwner = RTThreadNativeSelf();
1051#endif
1052 return rc;
1053}
1054
1055
1056/**
1057 * Releases the GMM giant lock.
1058 *
1059 * @returns Assert status code from RTSemFastMutexRequest.
1060 * @param pGMM Pointer to the GMM instance.
1061 */
1062static int gmmR0MutexRelease(PGMM pGMM)
1063{
1064#ifdef VBOX_STRICT
1065 pGMM->hMtxOwner = NIL_RTNATIVETHREAD;
1066#endif
1067#ifdef VBOX_USE_CRIT_SECT_FOR_GIANT
1068 int rc = RTCritSectLeave(&pGMM->GiantCritSect);
1069#else
1070 int rc = RTSemFastMutexRelease(pGMM->hMtx);
1071 AssertRC(rc);
1072#endif
1073 return rc;
1074}
1075
1076
1077/**
1078 * Yields the GMM giant lock if there is contention and a certain minimum time
1079 * has elapsed since we took it.
1080 *
1081 * @returns @c true if the mutex was yielded, @c false if not.
1082 * @param pGMM Pointer to the GMM instance.
1083 * @param puLockNanoTS Where the lock acquisition time stamp is kept
1084 * (in/out).
1085 */
1086static bool gmmR0MutexYield(PGMM pGMM, uint64_t *puLockNanoTS)
1087{
1088 /*
1089 * If nobody is contending the mutex, don't bother checking the time.
1090 */
1091 if (ASMAtomicReadU32(&pGMM->cMtxContenders) == 0)
1092 return false;
1093
1094 /*
1095 * Don't yield if we haven't executed for at least 2 milliseconds.
1096 */
1097 uint64_t uNanoNow = RTTimeSystemNanoTS();
1098 if (uNanoNow - *puLockNanoTS < UINT32_C(2000000))
1099 return false;
1100
1101 /*
1102 * Yield the mutex.
1103 */
1104#ifdef VBOX_STRICT
1105 pGMM->hMtxOwner = NIL_RTNATIVETHREAD;
1106#endif
1107 ASMAtomicIncU32(&pGMM->cMtxContenders);
1108#ifdef VBOX_USE_CRIT_SECT_FOR_GIANT
1109 int rc1 = RTCritSectLeave(&pGMM->GiantCritSect); AssertRC(rc1);
1110#else
1111 int rc1 = RTSemFastMutexRelease(pGMM->hMtx); AssertRC(rc1);
1112#endif
1113
1114 RTThreadYield();
1115
1116#ifdef VBOX_USE_CRIT_SECT_FOR_GIANT
1117 int rc2 = RTCritSectEnter(&pGMM->GiantCritSect); AssertRC(rc2);
1118#else
1119 int rc2 = RTSemFastMutexRequest(pGMM->hMtx); AssertRC(rc2);
1120#endif
1121 *puLockNanoTS = RTTimeSystemNanoTS();
1122 ASMAtomicDecU32(&pGMM->cMtxContenders);
1123#ifdef VBOX_STRICT
1124 pGMM->hMtxOwner = RTThreadNativeSelf();
1125#endif
1126
1127 return true;
1128}
1129
1130
1131/**
1132 * Acquires a chunk lock.
1133 *
1134 * The caller must own the giant lock.
1135 *
1136 * @returns Assert status code from RTSemFastMutexRequest.
1137 * @param pMtxState The chunk mutex state info. (Avoids
1138 * passing the same flags and stuff around
1139 * for subsequent release and drop-giant
1140 * calls.)
1141 * @param pGMM Pointer to the GMM instance.
1142 * @param pChunk Pointer to the chunk.
1143 * @param fFlags Flags regarding the giant lock, GMMR0CHUNK_MTX_XXX.
1144 */
1145static int gmmR0ChunkMutexAcquire(PGMMR0CHUNKMTXSTATE pMtxState, PGMM pGMM, PGMMCHUNK pChunk, uint32_t fFlags)
1146{
1147 Assert(fFlags > GMMR0CHUNK_MTX_INVALID && fFlags < GMMR0CHUNK_MTX_END);
1148 Assert(pGMM->hMtxOwner == RTThreadNativeSelf());
1149
1150 pMtxState->pGMM = pGMM;
1151 pMtxState->fFlags = (uint8_t)fFlags;
1152
1153 /*
1154 * Get the lock index and reference the lock.
1155 */
1156 Assert(pGMM->hMtxOwner == RTThreadNativeSelf());
1157 uint32_t iChunkMtx = pChunk->iChunkMtx;
1158 if (iChunkMtx == UINT8_MAX)
1159 {
1160 iChunkMtx = pGMM->iNextChunkMtx++;
1161 iChunkMtx %= RT_ELEMENTS(pGMM->aChunkMtx);
1162
1163 /* Try get an unused one... */
1164 if (pGMM->aChunkMtx[iChunkMtx].cUsers)
1165 {
1166 iChunkMtx = pGMM->iNextChunkMtx++;
1167 iChunkMtx %= RT_ELEMENTS(pGMM->aChunkMtx);
1168 if (pGMM->aChunkMtx[iChunkMtx].cUsers)
1169 {
1170 iChunkMtx = pGMM->iNextChunkMtx++;
1171 iChunkMtx %= RT_ELEMENTS(pGMM->aChunkMtx);
1172 if (pGMM->aChunkMtx[iChunkMtx].cUsers)
1173 {
1174 iChunkMtx = pGMM->iNextChunkMtx++;
1175 iChunkMtx %= RT_ELEMENTS(pGMM->aChunkMtx);
1176 }
1177 }
1178 }
1179
1180 pChunk->iChunkMtx = iChunkMtx;
1181 }
1182 AssertCompile(RT_ELEMENTS(pGMM->aChunkMtx) < UINT8_MAX);
1183 pMtxState->iChunkMtx = (uint8_t)iChunkMtx;
1184 ASMAtomicIncU32(&pGMM->aChunkMtx[iChunkMtx].cUsers);
1185
1186 /*
1187 * Drop the giant?
1188 */
1189 if (fFlags != GMMR0CHUNK_MTX_KEEP_GIANT)
1190 {
1191 /** @todo GMM life cycle cleanup (we may race someone
1192 * destroying and cleaning up GMM)? */
1193 gmmR0MutexRelease(pGMM);
1194 }
1195
1196 /*
1197 * Take the chunk mutex.
1198 */
1199 int rc = RTSemFastMutexRequest(pGMM->aChunkMtx[iChunkMtx].hMtx);
1200 AssertRC(rc);
1201 return rc;
1202}
1203
1204
1205/**
1206 * Releases a chunk mutex acquired by gmmR0ChunkMutexAcquire, retaking the giant lock if requested.
1207 *
1208 * @returns Assert status code from RTSemFastMutexRequest.
1209 * @param pMtxState Pointer to the chunk mutex state.
1210 * @param pChunk Pointer to the chunk if it's still
1211 * alive, NULL if it isn't. This is used to deassociate
1212 * the chunk from the mutex on the way out so a new one
1213 * can be selected next time, thus avoiding contented
1214 * mutexes.
1215 */
1216static int gmmR0ChunkMutexRelease(PGMMR0CHUNKMTXSTATE pMtxState, PGMMCHUNK pChunk)
1217{
1218 PGMM pGMM = pMtxState->pGMM;
1219
1220 /*
1221 * Release the chunk mutex and reacquire the giant if requested.
1222 */
1223 int rc = RTSemFastMutexRelease(pGMM->aChunkMtx[pMtxState->iChunkMtx].hMtx);
1224 AssertRC(rc);
1225 if (pMtxState->fFlags == GMMR0CHUNK_MTX_RETAKE_GIANT)
1226 rc = gmmR0MutexAcquire(pGMM);
1227 else
1228 Assert((pMtxState->fFlags != GMMR0CHUNK_MTX_DROP_GIANT) == (pGMM->hMtxOwner == RTThreadNativeSelf()));
1229
1230 /*
1231 * Drop the chunk mutex user reference and deassociate it from the chunk
1232 * when possible.
1233 */
1234 if ( ASMAtomicDecU32(&pGMM->aChunkMtx[pMtxState->iChunkMtx].cUsers) == 0
1235 && pChunk
1236 && RT_SUCCESS(rc) )
1237 {
1238 if (pMtxState->fFlags != GMMR0CHUNK_MTX_DROP_GIANT)
1239 pChunk->iChunkMtx = UINT8_MAX;
1240 else
1241 {
1242 rc = gmmR0MutexAcquire(pGMM);
1243 if (RT_SUCCESS(rc))
1244 {
1245 if (pGMM->aChunkMtx[pMtxState->iChunkMtx].cUsers == 0)
1246 pChunk->iChunkMtx = UINT8_MAX;
1247 rc = gmmR0MutexRelease(pGMM);
1248 }
1249 }
1250 }
1251
1252 pMtxState->pGMM = NULL;
1253 return rc;
1254}
1255
1256
1257/**
1258 * Drops the giant GMM lock we kept in gmmR0ChunkMutexAcquire while keeping the
1259 * chunk locked.
1260 *
1261 * This only works if gmmR0ChunkMutexAcquire was called with
1262 * GMMR0CHUNK_MTX_KEEP_GIANT. gmmR0ChunkMutexRelease will retake the giant
1263 * mutex, i.e. behave as if GMMR0CHUNK_MTX_RETAKE_GIANT was used.
1264 *
1265 * @returns VBox status code (assuming success is ok).
1266 * @param pMtxState Pointer to the chunk mutex state.
1267 */
1268static int gmmR0ChunkMutexDropGiant(PGMMR0CHUNKMTXSTATE pMtxState)
1269{
1270 AssertReturn(pMtxState->fFlags == GMMR0CHUNK_MTX_KEEP_GIANT, VERR_GMM_MTX_FLAGS);
1271 Assert(pMtxState->pGMM->hMtxOwner == RTThreadNativeSelf());
1272 pMtxState->fFlags = GMMR0CHUNK_MTX_RETAKE_GIANT;
1273 /** @todo GMM life cycle cleanup (we may race someone
1274 * destroying and cleaning up GMM)? */
1275 return gmmR0MutexRelease(pMtxState->pGMM);
1276}
1277
1278
1279/**
1280 * For experimenting with NUMA affinity and such.
1281 *
1282 * @returns The current NUMA Node ID.
1283 */
1284static uint16_t gmmR0GetCurrentNumaNodeId(void)
1285{
1286#if 1
1287 return GMM_CHUNK_NUMA_ID_UNKNOWN;
1288#else
1289 return RTMpCpuId() / 16;
1290#endif
1291}
1292
1293
1294
1295/**
1296 * Cleans up when a VM is terminating.
1297 *
1298 * @param pGVM Pointer to the Global VM structure.
1299 */
1300GMMR0DECL(void) GMMR0CleanupVM(PGVM pGVM)
1301{
1302 LogFlow(("GMMR0CleanupVM: pGVM=%p:{.hSelf=%#x}\n", pGVM, pGVM->hSelf));
1303
1304 PGMM pGMM;
1305 GMM_GET_VALID_INSTANCE_VOID(pGMM);
1306
1307#ifdef VBOX_WITH_PAGE_SHARING
1308 /*
1309 * Clean up all registered shared modules first.
1310 */
1311 gmmR0SharedModuleCleanup(pGMM, pGVM);
1312#endif
1313
1314 gmmR0MutexAcquire(pGMM);
1315 uint64_t uLockNanoTS = RTTimeSystemNanoTS();
1316 GMM_CHECK_SANITY_UPON_ENTERING(pGMM);
1317
1318 /*
1319 * The policy is 'INVALID' until the initial reservation
1320 * request has been serviced.
1321 */
1322 if ( pGVM->gmm.s.Stats.enmPolicy > GMMOCPOLICY_INVALID
1323 && pGVM->gmm.s.Stats.enmPolicy < GMMOCPOLICY_END)
1324 {
1325 /*
1326 * If it's the last VM around, we can skip walking all the chunks looking
1327 * for the pages owned by this VM and instead flush the whole shebang.
1328 *
1329 * This takes care of the eventuality that a VM has left shared page
1330 * references behind (shouldn't happen of course, but you never know).
1331 */
1332 Assert(pGMM->cRegisteredVMs);
1333 pGMM->cRegisteredVMs--;
1334
1335 /*
1336 * Walk the entire pool looking for pages that belong to this VM
1337 * and leftover mappings. (This'll only catch private pages,
1338 * shared pages will be 'left behind'.)
1339 */
1340 /** @todo r=bird: This scanning+freeing could be optimized in bound mode! */
1341 uint64_t cPrivatePages = pGVM->gmm.s.Stats.cPrivatePages; /* save */
1342
1343 unsigned iCountDown = 64;
1344 bool fRedoFromStart;
1345 PGMMCHUNK pChunk;
1346 do
1347 {
1348 fRedoFromStart = false;
1349 RTListForEachReverse(&pGMM->ChunkList, pChunk, GMMCHUNK, ListNode)
1350 {
1351 uint32_t const cFreeChunksOld = pGMM->cFreedChunks;
1352 if ( ( !pGMM->fBoundMemoryMode
1353 || pChunk->hGVM == pGVM->hSelf)
1354 && gmmR0CleanupVMScanChunk(pGMM, pGVM, pChunk))
1355 {
1356 /* We left the giant mutex, so reset the yield counters. */
1357 uLockNanoTS = RTTimeSystemNanoTS();
1358 iCountDown = 64;
1359 }
1360 else
1361 {
1362 /* Didn't leave it, so do normal yielding. */
1363 if (!iCountDown)
1364 gmmR0MutexYield(pGMM, &uLockNanoTS);
1365 else
1366 iCountDown--;
1367 }
1368 if (pGMM->cFreedChunks != cFreeChunksOld)
1369 {
1370 fRedoFromStart = true;
1371 break;
1372 }
1373 }
1374 } while (fRedoFromStart);
1375
1376 if (pGVM->gmm.s.Stats.cPrivatePages)
1377 SUPR0Printf("GMMR0CleanupVM: hGVM=%#x has %#x private pages that cannot be found!\n", pGVM->hSelf, pGVM->gmm.s.Stats.cPrivatePages);
1378
1379 pGMM->cAllocatedPages -= cPrivatePages;
1380
1381 /*
1382 * Free empty chunks.
1383 */
1384 PGMMCHUNKFREESET pPrivateSet = pGMM->fBoundMemoryMode ? &pGVM->gmm.s.Private : &pGMM->PrivateX;
1385 do
1386 {
1387 fRedoFromStart = false;
1388 iCountDown = 10240;
1389 pChunk = pPrivateSet->apLists[GMM_CHUNK_FREE_SET_UNUSED_LIST];
1390 while (pChunk)
1391 {
1392 PGMMCHUNK pNext = pChunk->pFreeNext;
1393 Assert(pChunk->cFree == GMM_CHUNK_NUM_PAGES);
1394 if ( !pGMM->fBoundMemoryMode
1395 || pChunk->hGVM == pGVM->hSelf)
1396 {
1397 uint64_t const idGenerationOld = pPrivateSet->idGeneration;
1398 if (gmmR0FreeChunk(pGMM, pGVM, pChunk, true /*fRelaxedSem*/))
1399 {
1400 /* We've left the giant mutex, restart? (+1 for our unlink) */
1401 fRedoFromStart = pPrivateSet->idGeneration != idGenerationOld + 1;
1402 if (fRedoFromStart)
1403 break;
1404 uLockNanoTS = RTTimeSystemNanoTS();
1405 iCountDown = 10240;
1406 }
1407 }
1408
1409 /* Advance and maybe yield the lock. */
1410 pChunk = pNext;
1411 if (--iCountDown == 0)
1412 {
1413 uint64_t const idGenerationOld = pPrivateSet->idGeneration;
1414 fRedoFromStart = gmmR0MutexYield(pGMM, &uLockNanoTS)
1415 && pPrivateSet->idGeneration != idGenerationOld;
1416 if (fRedoFromStart)
1417 break;
1418 iCountDown = 10240;
1419 }
1420 }
1421 } while (fRedoFromStart);
1422
1423 /*
1424 * Account for shared pages that weren't freed.
1425 */
1426 if (pGVM->gmm.s.Stats.cSharedPages)
1427 {
1428 Assert(pGMM->cSharedPages >= pGVM->gmm.s.Stats.cSharedPages);
1429 SUPR0Printf("GMMR0CleanupVM: hGVM=%#x left %#x shared pages behind!\n", pGVM->hSelf, pGVM->gmm.s.Stats.cSharedPages);
1430 pGMM->cLeftBehindSharedPages += pGVM->gmm.s.Stats.cSharedPages;
1431 }
1432
1433 /*
1434 * Clean up balloon statistics in case the VM process crashed.
1435 */
1436 Assert(pGMM->cBalloonedPages >= pGVM->gmm.s.Stats.cBalloonedPages);
1437 pGMM->cBalloonedPages -= pGVM->gmm.s.Stats.cBalloonedPages;
1438
1439 /*
1440 * Update the over-commitment management statistics.
1441 */
1442 pGMM->cReservedPages -= pGVM->gmm.s.Stats.Reserved.cBasePages
1443 + pGVM->gmm.s.Stats.Reserved.cFixedPages
1444 + pGVM->gmm.s.Stats.Reserved.cShadowPages;
1445 switch (pGVM->gmm.s.Stats.enmPolicy)
1446 {
1447 case GMMOCPOLICY_NO_OC:
1448 break;
1449 default:
1450 /** @todo Update GMM->cOverCommittedPages */
1451 break;
1452 }
1453 }
1454
1455 /* zap the GVM data. */
1456 pGVM->gmm.s.Stats.enmPolicy = GMMOCPOLICY_INVALID;
1457 pGVM->gmm.s.Stats.enmPriority = GMMPRIORITY_INVALID;
1458 pGVM->gmm.s.Stats.fMayAllocate = false;
1459
1460 GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
1461 gmmR0MutexRelease(pGMM);
1462
1463 /*
1464 * Destroy the spinlock.
1465 */
1466 RTSPINLOCK hSpinlock = NIL_RTSPINLOCK;
1467 ASMAtomicXchgHandle(&pGVM->gmm.s.hChunkTlbSpinLock, NIL_RTSPINLOCK, &hSpinlock);
1468 RTSpinlockDestroy(hSpinlock);
1469
1470 LogFlow(("GMMR0CleanupVM: returns\n"));
1471}
1472
1473
1474/**
1475 * Scan one chunk for private pages belonging to the specified VM.
1476 *
1477 * @note This function may drop the giant mutex!
1478 *
1479 * @returns @c true if we've temporarily dropped the giant mutex, @c false if
1480 * we didn't.
1481 * @param pGMM Pointer to the GMM instance.
1482 * @param pGVM The global VM handle.
1483 * @param pChunk The chunk to scan.
1484 */
1485static bool gmmR0CleanupVMScanChunk(PGMM pGMM, PGVM pGVM, PGMMCHUNK pChunk)
1486{
1487 Assert(!pGMM->fBoundMemoryMode || pChunk->hGVM == pGVM->hSelf);
1488
1489 /*
1490 * Look for pages belonging to the VM.
1491 * (Perform some internal checks while we're scanning.)
1492 */
1493#ifndef VBOX_STRICT
1494 if (pChunk->cFree != (GMM_CHUNK_SIZE >> PAGE_SHIFT))
1495#endif
1496 {
1497 unsigned cPrivate = 0;
1498 unsigned cShared = 0;
1499 unsigned cFree = 0;
1500
1501 gmmR0UnlinkChunk(pChunk); /* avoiding cFreePages updates. */
1502
1503 uint16_t hGVM = pGVM->hSelf;
1504 unsigned iPage = (GMM_CHUNK_SIZE >> PAGE_SHIFT);
1505 while (iPage-- > 0)
1506 if (GMM_PAGE_IS_PRIVATE(&pChunk->aPages[iPage]))
1507 {
1508 if (pChunk->aPages[iPage].Private.hGVM == hGVM)
1509 {
1510 /*
1511 * Free the page.
1512 *
1513 * The reason for not using gmmR0FreePrivatePage here is that we
1514 * must *not* cause the chunk to be freed from under us - we're in
1515 * an AVL tree walk here.
1516 */
1517 pChunk->aPages[iPage].u = 0;
1518 pChunk->aPages[iPage].Free.iNext = pChunk->iFreeHead;
1519 pChunk->aPages[iPage].Free.u2State = GMM_PAGE_STATE_FREE;
1520 pChunk->iFreeHead = iPage;
1521 pChunk->cPrivate--;
1522 pChunk->cFree++;
1523 pGVM->gmm.s.Stats.cPrivatePages--;
1524 cFree++;
1525 }
1526 else
1527 cPrivate++;
1528 }
1529 else if (GMM_PAGE_IS_FREE(&pChunk->aPages[iPage]))
1530 cFree++;
1531 else
1532 cShared++;
1533
1534 gmmR0SelectSetAndLinkChunk(pGMM, pGVM, pChunk);
1535
1536 /*
1537 * Did it add up?
1538 */
1539 if (RT_UNLIKELY( pChunk->cFree != cFree
1540 || pChunk->cPrivate != cPrivate
1541 || pChunk->cShared != cShared))
1542 {
1543 SUPR0Printf("gmmR0CleanupVMScanChunk: Chunk %RKv/%#x has bogus stats - free=%d/%d private=%d/%d shared=%d/%d\n",
1544 pChunk, pChunk->Core.Key, pChunk->cFree, cFree, pChunk->cPrivate, cPrivate, pChunk->cShared, cShared);
1545 pChunk->cFree = cFree;
1546 pChunk->cPrivate = cPrivate;
1547 pChunk->cShared = cShared;
1548 }
1549 }
1550
1551 /*
1552 * If not in bound memory mode, we should reset the hGVM field
1553 * if it has our handle in it.
1554 */
1555 if (pChunk->hGVM == pGVM->hSelf)
1556 {
1557 if (!g_pGMM->fBoundMemoryMode)
1558 pChunk->hGVM = NIL_GVM_HANDLE;
1559 else if (pChunk->cFree != GMM_CHUNK_NUM_PAGES)
1560 {
1561 SUPR0Printf("gmmR0CleanupVMScanChunk: %RKv/%#x: cFree=%#x - it should be 0 in bound mode!\n",
1562 pChunk, pChunk->Core.Key, pChunk->cFree);
1563 AssertMsgFailed(("%p/%#x: cFree=%#x - it should be 0 in bound mode!\n", pChunk, pChunk->Core.Key, pChunk->cFree));
1564
1565 gmmR0UnlinkChunk(pChunk);
1566 pChunk->cFree = GMM_CHUNK_NUM_PAGES;
1567 gmmR0SelectSetAndLinkChunk(pGMM, pGVM, pChunk);
1568 }
1569 }
1570
1571 /*
1572 * Look for a mapping belonging to the terminating VM.
1573 */
1574 GMMR0CHUNKMTXSTATE MtxState;
1575 gmmR0ChunkMutexAcquire(&MtxState, pGMM, pChunk, GMMR0CHUNK_MTX_KEEP_GIANT);
1576 unsigned cMappings = pChunk->cMappingsX;
1577 for (unsigned i = 0; i < cMappings; i++)
1578 if (pChunk->paMappingsX[i].pGVM == pGVM)
1579 {
1580 gmmR0ChunkMutexDropGiant(&MtxState);
1581
1582 RTR0MEMOBJ hMemObj = pChunk->paMappingsX[i].hMapObj;
1583
1584 cMappings--;
1585 if (i < cMappings)
1586 pChunk->paMappingsX[i] = pChunk->paMappingsX[cMappings];
1587 pChunk->paMappingsX[cMappings].pGVM = NULL;
1588 pChunk->paMappingsX[cMappings].hMapObj = NIL_RTR0MEMOBJ;
1589 Assert(pChunk->cMappingsX - 1U == cMappings);
1590 pChunk->cMappingsX = cMappings;
1591
1592 int rc = RTR0MemObjFree(hMemObj, false /* fFreeMappings (NA) */);
1593 if (RT_FAILURE(rc))
1594 {
1595 SUPR0Printf("gmmR0CleanupVMScanChunk: %RKv/%#x: mapping #%x: RTRMemObjFree(%RKv,false) -> %d \n",
1596 pChunk, pChunk->Core.Key, i, hMemObj, rc);
1597 AssertRC(rc);
1598 }
1599
1600 gmmR0ChunkMutexRelease(&MtxState, pChunk);
1601 return true;
1602 }
1603
1604 gmmR0ChunkMutexRelease(&MtxState, pChunk);
1605 return false;
1606}
1607
1608
1609/**
1610 * The initial resource reservations.
1611 *
1612 * This will make memory reservations according to policy and priority. If there aren't
1613 * sufficient resources available to sustain the VM this function will fail and all
1614 * future allocation requests will fail as well.
1615 *
1616 * These are just the initial reservations made very early during the VM creation
1617 * process and will be adjusted later in the GMMR0UpdateReservation call after the
1618 * ring-3 init has completed.
1619 *
1620 * @returns VBox status code.
1621 * @retval VERR_GMM_MEMORY_RESERVATION_DECLINED
1622 * @retval VERR_GMM_
1623 *
1624 * @param pGVM The global (ring-0) VM structure.
1625 * @param idCpu The VCPU id - must be zero.
1626 * @param cBasePages The number of pages that may be allocated for the base RAM and ROMs.
1627 * This does not include MMIO2 and similar.
1628 * @param cShadowPages The number of pages that may be allocated for shadow paging structures.
1629 * @param cFixedPages The number of pages that may be allocated for fixed objects like the
1630 * hyper heap, MMIO2 and similar.
1631 * @param enmPolicy The OC policy to use on this VM.
1632 * @param enmPriority The priority in an out-of-memory situation.
1633 *
1634 * @thread The creator thread / EMT(0).
1635 */
1636GMMR0DECL(int) GMMR0InitialReservation(PGVM pGVM, VMCPUID idCpu, uint64_t cBasePages, uint32_t cShadowPages,
1637 uint32_t cFixedPages, GMMOCPOLICY enmPolicy, GMMPRIORITY enmPriority)
1638{
1639 LogFlow(("GMMR0InitialReservation: pGVM=%p cBasePages=%#llx cShadowPages=%#x cFixedPages=%#x enmPolicy=%d enmPriority=%d\n",
1640 pGVM, cBasePages, cShadowPages, cFixedPages, enmPolicy, enmPriority));
1641
1642 /*
1643 * Validate, get basics and take the semaphore.
1644 */
1645 AssertReturn(idCpu == 0, VERR_INVALID_CPU_ID);
1646 PGMM pGMM;
1647 GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
1648 int rc = GVMMR0ValidateGVMandEMT(pGVM, idCpu);
1649 if (RT_FAILURE(rc))
1650 return rc;
1651
1652 AssertReturn(cBasePages, VERR_INVALID_PARAMETER);
1653 AssertReturn(cShadowPages, VERR_INVALID_PARAMETER);
1654 AssertReturn(cFixedPages, VERR_INVALID_PARAMETER);
1655 AssertReturn(enmPolicy > GMMOCPOLICY_INVALID && enmPolicy < GMMOCPOLICY_END, VERR_INVALID_PARAMETER);
1656 AssertReturn(enmPriority > GMMPRIORITY_INVALID && enmPriority < GMMPRIORITY_END, VERR_INVALID_PARAMETER);
1657
1658 gmmR0MutexAcquire(pGMM);
1659 if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
1660 {
1661 if ( !pGVM->gmm.s.Stats.Reserved.cBasePages
1662 && !pGVM->gmm.s.Stats.Reserved.cFixedPages
1663 && !pGVM->gmm.s.Stats.Reserved.cShadowPages)
1664 {
1665 /*
1666 * Check if we can accommodate this.
1667 */
1668 /* ... later ... */
1669 if (RT_SUCCESS(rc))
1670 {
1671 /*
1672 * Update the records.
1673 */
1674 pGVM->gmm.s.Stats.Reserved.cBasePages = cBasePages;
1675 pGVM->gmm.s.Stats.Reserved.cFixedPages = cFixedPages;
1676 pGVM->gmm.s.Stats.Reserved.cShadowPages = cShadowPages;
1677 pGVM->gmm.s.Stats.enmPolicy = enmPolicy;
1678 pGVM->gmm.s.Stats.enmPriority = enmPriority;
1679 pGVM->gmm.s.Stats.fMayAllocate = true;
1680
1681 pGMM->cReservedPages += cBasePages + cFixedPages + cShadowPages;
1682 pGMM->cRegisteredVMs++;
1683 }
1684 }
1685 else
1686 rc = VERR_WRONG_ORDER;
1687 GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
1688 }
1689 else
1690 rc = VERR_GMM_IS_NOT_SANE;
1691 gmmR0MutexRelease(pGMM);
1692 LogFlow(("GMMR0InitialReservation: returns %Rrc\n", rc));
1693 return rc;
1694}
1695
1696
1697/**
1698 * VMMR0 request wrapper for GMMR0InitialReservation.
1699 *
1700 * @returns see GMMR0InitialReservation.
1701 * @param pGVM The global (ring-0) VM structure.
1702 * @param idCpu The VCPU id.
1703 * @param pReq Pointer to the request packet.
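 *
 * A rough sketch of how a ring-3 caller might fill and submit the request; the
 * SUPVMMR0REQHDR_MAGIC constant and the VMMR3CallR0 / VMMR0_DO_GMM_INITIAL_RESERVATION
 * dispatch are assumptions here, only the field names are taken from this wrapper:
 * @code
 *      GMMINITIALRESERVATIONREQ Req;
 *      Req.Hdr.u32Magic = SUPVMMR0REQHDR_MAGIC;
 *      Req.Hdr.cbReq    = sizeof(Req);
 *      Req.cBasePages   = cBasePages;      // guest RAM and ROMs
 *      Req.cShadowPages = cShadowPages;    // shadow paging structures
 *      Req.cFixedPages  = cFixedPages;     // hyper heap, MMIO2 and similar
 *      Req.enmPolicy    = enmPolicy;       // strictly between GMMOCPOLICY_INVALID and GMMOCPOLICY_END
 *      Req.enmPriority  = enmPriority;     // strictly between GMMPRIORITY_INVALID and GMMPRIORITY_END
 *      int rc = VMMR3CallR0(pVM, VMMR0_DO_GMM_INITIAL_RESERVATION, 0, &Req.Hdr);
 * @endcode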
1704 */
1705GMMR0DECL(int) GMMR0InitialReservationReq(PGVM pGVM, VMCPUID idCpu, PGMMINITIALRESERVATIONREQ pReq)
1706{
1707 /*
1708 * Validate input and pass it on.
1709 */
1710 AssertPtrReturn(pGVM, VERR_INVALID_POINTER);
1711 AssertPtrReturn(pReq, VERR_INVALID_POINTER);
1712 AssertMsgReturn(pReq->Hdr.cbReq == sizeof(*pReq), ("%#x != %#x\n", pReq->Hdr.cbReq, sizeof(*pReq)), VERR_INVALID_PARAMETER);
1713
1714 return GMMR0InitialReservation(pGVM, idCpu, pReq->cBasePages, pReq->cShadowPages,
1715 pReq->cFixedPages, pReq->enmPolicy, pReq->enmPriority);
1716}
1717
1718
1719/**
1720 * This updates the memory reservation with the additional MMIO2 and ROM pages.
1721 *
1722 * @returns VBox status code.
1723 * @retval VERR_GMM_MEMORY_RESERVATION_DECLINED
1724 *
1725 * @param pGVM The global (ring-0) VM structure.
1726 * @param idCpu The VCPU id.
1727 * @param cBasePages The number of pages that may be allocated for the base RAM and ROMs.
1728 * This does not include MMIO2 and similar.
1729 * @param cShadowPages The number of pages that may be allocated for shadow paging structures.
1730 * @param cFixedPages The number of pages that may be allocated for fixed objects like the
1731 * hyper heap, MMIO2 and similar.
1732 *
1733 * @thread EMT(idCpu)
1734 */
1735GMMR0DECL(int) GMMR0UpdateReservation(PGVM pGVM, VMCPUID idCpu, uint64_t cBasePages,
1736 uint32_t cShadowPages, uint32_t cFixedPages)
1737{
1738 LogFlow(("GMMR0UpdateReservation: pGVM=%p cBasePages=%#llx cShadowPages=%#x cFixedPages=%#x\n",
1739 pGVM, cBasePages, cShadowPages, cFixedPages));
1740
1741 /*
1742 * Validate, get basics and take the semaphore.
1743 */
1744 PGMM pGMM;
1745 GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
1746 int rc = GVMMR0ValidateGVMandEMT(pGVM, idCpu);
1747 if (RT_FAILURE(rc))
1748 return rc;
1749
1750 AssertReturn(cBasePages, VERR_INVALID_PARAMETER);
1751 AssertReturn(cShadowPages, VERR_INVALID_PARAMETER);
1752 AssertReturn(cFixedPages, VERR_INVALID_PARAMETER);
1753
1754 gmmR0MutexAcquire(pGMM);
1755 if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
1756 {
1757 if ( pGVM->gmm.s.Stats.Reserved.cBasePages
1758 && pGVM->gmm.s.Stats.Reserved.cFixedPages
1759 && pGVM->gmm.s.Stats.Reserved.cShadowPages)
1760 {
1761 /*
1762 * Check if we can accommodate this.
1763 */
1764 /* ... later ... */
1765 if (RT_SUCCESS(rc))
1766 {
1767 /*
1768 * Update the records.
1769 */
1770 pGMM->cReservedPages -= pGVM->gmm.s.Stats.Reserved.cBasePages
1771 + pGVM->gmm.s.Stats.Reserved.cFixedPages
1772 + pGVM->gmm.s.Stats.Reserved.cShadowPages;
1773 pGMM->cReservedPages += cBasePages + cFixedPages + cShadowPages;
1774
1775 pGVM->gmm.s.Stats.Reserved.cBasePages = cBasePages;
1776 pGVM->gmm.s.Stats.Reserved.cFixedPages = cFixedPages;
1777 pGVM->gmm.s.Stats.Reserved.cShadowPages = cShadowPages;
1778 }
1779 }
1780 else
1781 rc = VERR_WRONG_ORDER;
1782 GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
1783 }
1784 else
1785 rc = VERR_GMM_IS_NOT_SANE;
1786 gmmR0MutexRelease(pGMM);
1787 LogFlow(("GMMR0UpdateReservation: returns %Rrc\n", rc));
1788 return rc;
1789}
1790
1791
1792/**
1793 * VMMR0 request wrapper for GMMR0UpdateReservation.
1794 *
1795 * @returns see GMMR0UpdateReservation.
1796 * @param pGVM The global (ring-0) VM structure.
1797 * @param idCpu The VCPU id.
1798 * @param pReq Pointer to the request packet.
1799 */
1800GMMR0DECL(int) GMMR0UpdateReservationReq(PGVM pGVM, VMCPUID idCpu, PGMMUPDATERESERVATIONREQ pReq)
1801{
1802 /*
1803 * Validate input and pass it on.
1804 */
1805 AssertPtrReturn(pReq, VERR_INVALID_POINTER);
1806 AssertMsgReturn(pReq->Hdr.cbReq == sizeof(*pReq), ("%#x != %#x\n", pReq->Hdr.cbReq, sizeof(*pReq)), VERR_INVALID_PARAMETER);
1807
1808 return GMMR0UpdateReservation(pGVM, idCpu, pReq->cBasePages, pReq->cShadowPages, pReq->cFixedPages);
1809}
1810
1811#ifdef GMMR0_WITH_SANITY_CHECK
1812
1813/**
1814 * Performs sanity checks on a free set.
1815 *
1816 * @returns Error count.
1817 *
1818 * @param pGMM Pointer to the GMM instance.
1819 * @param pSet Pointer to the set.
1820 * @param pszSetName The set name.
1821 * @param pszFunction The function from which it was called.
1822  * @param uLineNo The line number.
1823 */
1824static uint32_t gmmR0SanityCheckSet(PGMM pGMM, PGMMCHUNKFREESET pSet, const char *pszSetName,
1825 const char *pszFunction, unsigned uLineNo)
1826{
1827 uint32_t cErrors = 0;
1828
1829 /*
1830 * Count the free pages in all the chunks and match it against pSet->cFreePages.
1831 */
1832 uint32_t cPages = 0;
1833 for (unsigned i = 0; i < RT_ELEMENTS(pSet->apLists); i++)
1834 {
1835 for (PGMMCHUNK pCur = pSet->apLists[i]; pCur; pCur = pCur->pFreeNext)
1836 {
1837 /** @todo check that the chunk is hashed into the right set. */
1838 cPages += pCur->cFree;
1839 }
1840 }
1841 if (RT_UNLIKELY(cPages != pSet->cFreePages))
1842 {
1843 SUPR0Printf("GMM insanity: found %#x pages in the %s set, expected %#x. (%s, line %u)\n",
1844 cPages, pszSetName, pSet->cFreePages, pszFunction, uLineNo);
1845 cErrors++;
1846 }
1847
1848 return cErrors;
1849}
1850
1851
1852/**
1853  * Performs some sanity checks on the GMM while owning the lock.
1854 *
1855 * @returns Error count.
1856 *
1857 * @param pGMM Pointer to the GMM instance.
1858 * @param pszFunction The function from which it is called.
1859 * @param uLineNo The line number.
1860 */
1861static uint32_t gmmR0SanityCheck(PGMM pGMM, const char *pszFunction, unsigned uLineNo)
1862{
1863 uint32_t cErrors = 0;
1864
1865 cErrors += gmmR0SanityCheckSet(pGMM, &pGMM->PrivateX, "private", pszFunction, uLineNo);
1866 cErrors += gmmR0SanityCheckSet(pGMM, &pGMM->Shared, "shared", pszFunction, uLineNo);
1867 /** @todo add more sanity checks. */
1868
1869 return cErrors;
1870}
1871
1872#endif /* GMMR0_WITH_SANITY_CHECK */
1873
1874/**
1875  * Looks up a chunk in the tree and fills in the TLB entry for it.
1876 *
1877 * This is not expected to fail and will bitch if it does.
1878 *
1879 * @returns Pointer to the allocation chunk, NULL if not found.
1880 * @param pGMM Pointer to the GMM instance.
1881 * @param idChunk The ID of the chunk to find.
1882 * @param pTlbe Pointer to the TLB entry.
1883 *
1884 * @note Caller owns spinlock.
1885 */
1886static PGMMCHUNK gmmR0GetChunkSlow(PGMM pGMM, uint32_t idChunk, PGMMCHUNKTLBE pTlbe)
1887{
1888 PGMMCHUNK pChunk = (PGMMCHUNK)RTAvlU32Get(&pGMM->pChunks, idChunk);
1889 AssertMsgReturn(pChunk, ("Chunk %#x not found!\n", idChunk), NULL);
1890 pTlbe->idChunk = idChunk;
1891 pTlbe->pChunk = pChunk;
1892 return pChunk;
1893}
1894
1895
1896/**
1897  * Finds an allocation chunk, spin-locked.
1898 *
1899 * This is not expected to fail and will bitch if it does.
1900 *
1901 * @returns Pointer to the allocation chunk, NULL if not found.
1902 * @param pGMM Pointer to the GMM instance.
1903 * @param idChunk The ID of the chunk to find.
1904 */
1905DECLINLINE(PGMMCHUNK) gmmR0GetChunkLocked(PGMM pGMM, uint32_t idChunk)
1906{
1907 /*
1908 * Do a TLB lookup, branch if not in the TLB.
1909 */
1910 PGMMCHUNKTLBE pTlbe = &pGMM->ChunkTLB.aEntries[GMM_CHUNKTLB_IDX(idChunk)];
1911 PGMMCHUNK pChunk = pTlbe->pChunk;
1912 if ( pChunk == NULL
1913 || pTlbe->idChunk != idChunk)
1914 pChunk = gmmR0GetChunkSlow(pGMM, idChunk, pTlbe);
1915 return pChunk;
1916}
1917
1918
1919/**
1920  * Finds an allocation chunk.
1921 *
1922 * This is not expected to fail and will bitch if it does.
1923 *
1924 * @returns Pointer to the allocation chunk, NULL if not found.
1925 * @param pGMM Pointer to the GMM instance.
1926 * @param idChunk The ID of the chunk to find.
1927 */
1928DECLINLINE(PGMMCHUNK) gmmR0GetChunk(PGMM pGMM, uint32_t idChunk)
1929{
1930 RTSpinlockAcquire(pGMM->hSpinLockTree);
1931 PGMMCHUNK pChunk = gmmR0GetChunkLocked(pGMM, idChunk);
1932 RTSpinlockRelease(pGMM->hSpinLockTree);
1933 return pChunk;
1934}
1935
1936
1937/**
1938 * Finds a page.
1939 *
1940 * This is not expected to fail and will bitch if it does.
1941 *
1942 * @returns Pointer to the page, NULL if not found.
1943 * @param pGMM Pointer to the GMM instance.
1944 * @param idPage The ID of the page to find.
1945 */
1946DECLINLINE(PGMMPAGE) gmmR0GetPage(PGMM pGMM, uint32_t idPage)
1947{
1948 PGMMCHUNK pChunk = gmmR0GetChunk(pGMM, idPage >> GMM_CHUNKID_SHIFT);
1949 if (RT_LIKELY(pChunk))
1950 return &pChunk->aPages[idPage & GMM_PAGEID_IDX_MASK];
1951 return NULL;
1952}
1953
1954
1955#if 0 /* unused */
1956/**
1957  * Gets the host physical address for a page given by its ID.
1958 *
1959 * @returns The host physical address or NIL_RTHCPHYS.
1960 * @param pGMM Pointer to the GMM instance.
1961 * @param idPage The ID of the page to find.
1962 */
1963DECLINLINE(RTHCPHYS) gmmR0GetPageHCPhys(PGMM pGMM, uint32_t idPage)
1964{
1965 PGMMCHUNK pChunk = gmmR0GetChunk(pGMM, idPage >> GMM_CHUNKID_SHIFT);
1966 if (RT_LIKELY(pChunk))
1967 return RTR0MemObjGetPagePhysAddr(pChunk->hMemObj, idPage & GMM_PAGEID_IDX_MASK);
1968 return NIL_RTHCPHYS;
1969}
1970#endif /* unused */
1971
1972
1973/**
1974 * Selects the appropriate free list given the number of free pages.
1975 *
1976 * @returns Free list index.
1977 * @param cFree The number of free pages in the chunk.
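 *
 * As a worked example: the index is simply cFree >> GMM_CHUNK_FREE_SET_SHIFT, so
 * assuming a shift value of 4 (an assumption here), chunks with 0..15 free pages
 * land in list 0, 16..31 in list 1, and so on, while completely free chunks
 * presumably end up in the last list, GMM_CHUNK_FREE_SET_UNUSED_LIST (see
 * gmmR0AllocatePagesFromEmptyChunksOnSameNode below).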
1978 */
1979DECLINLINE(unsigned) gmmR0SelectFreeSetList(unsigned cFree)
1980{
1981 unsigned iList = cFree >> GMM_CHUNK_FREE_SET_SHIFT;
1982 AssertMsg(iList < RT_SIZEOFMEMB(GMMCHUNKFREESET, apLists) / RT_SIZEOFMEMB(GMMCHUNKFREESET, apLists[0]),
1983 ("%d (%u)\n", iList, cFree));
1984 return iList;
1985}
1986
1987
1988/**
1989 * Unlinks the chunk from the free list it's currently on (if any).
1990 *
1991 * @param pChunk The allocation chunk.
1992 */
1993DECLINLINE(void) gmmR0UnlinkChunk(PGMMCHUNK pChunk)
1994{
1995 PGMMCHUNKFREESET pSet = pChunk->pSet;
1996 if (RT_LIKELY(pSet))
1997 {
1998 pSet->cFreePages -= pChunk->cFree;
1999 pSet->idGeneration++;
2000
2001 PGMMCHUNK pPrev = pChunk->pFreePrev;
2002 PGMMCHUNK pNext = pChunk->pFreeNext;
2003 if (pPrev)
2004 pPrev->pFreeNext = pNext;
2005 else
2006 pSet->apLists[gmmR0SelectFreeSetList(pChunk->cFree)] = pNext;
2007 if (pNext)
2008 pNext->pFreePrev = pPrev;
2009
2010 pChunk->pSet = NULL;
2011 pChunk->pFreeNext = NULL;
2012 pChunk->pFreePrev = NULL;
2013 }
2014 else
2015 {
2016 Assert(!pChunk->pFreeNext);
2017 Assert(!pChunk->pFreePrev);
2018 Assert(!pChunk->cFree);
2019 }
2020}
2021
2022
2023/**
2024 * Links the chunk onto the appropriate free list in the specified free set.
2025 *
2026  * If the chunk has no free entries, it's not linked into any list.
2027 *
2028 * @param pChunk The allocation chunk.
2029 * @param pSet The free set.
2030 */
2031DECLINLINE(void) gmmR0LinkChunk(PGMMCHUNK pChunk, PGMMCHUNKFREESET pSet)
2032{
2033 Assert(!pChunk->pSet);
2034 Assert(!pChunk->pFreeNext);
2035 Assert(!pChunk->pFreePrev);
2036
2037 if (pChunk->cFree > 0)
2038 {
2039 pChunk->pSet = pSet;
2040 pChunk->pFreePrev = NULL;
2041 unsigned const iList = gmmR0SelectFreeSetList(pChunk->cFree);
2042 pChunk->pFreeNext = pSet->apLists[iList];
2043 if (pChunk->pFreeNext)
2044 pChunk->pFreeNext->pFreePrev = pChunk;
2045 pSet->apLists[iList] = pChunk;
2046
2047 pSet->cFreePages += pChunk->cFree;
2048 pSet->idGeneration++;
2049 }
2050}
2051
2052
2053/**
2054  * Selects the appropriate free set and links the chunk onto the matching free list there.
2055  *
2056  * If the chunk has no free entries, it's not linked into any list.
2057 *
2058 * @param pGMM Pointer to the GMM instance.
2059  * @param pGVM Pointer to the kernel-only VM instance data.
2060 * @param pChunk The allocation chunk.
2061 */
2062DECLINLINE(void) gmmR0SelectSetAndLinkChunk(PGMM pGMM, PGVM pGVM, PGMMCHUNK pChunk)
2063{
2064 PGMMCHUNKFREESET pSet;
2065 if (pGMM->fBoundMemoryMode)
2066 pSet = &pGVM->gmm.s.Private;
2067 else if (pChunk->cShared)
2068 pSet = &pGMM->Shared;
2069 else
2070 pSet = &pGMM->PrivateX;
2071 gmmR0LinkChunk(pChunk, pSet);
2072}
2073
2074
2075/**
2076 * Frees a Chunk ID.
2077 *
2078 * @param pGMM Pointer to the GMM instance.
2079 * @param idChunk The Chunk ID to free.
2080 */
2081static void gmmR0FreeChunkId(PGMM pGMM, uint32_t idChunk)
2082{
2083 AssertReturnVoid(idChunk != NIL_GMM_CHUNKID);
2084 AssertMsg(ASMBitTest(&pGMM->bmChunkId[0], idChunk), ("%#x\n", idChunk));
2085 ASMAtomicBitClear(&pGMM->bmChunkId[0], idChunk);
2086}
2087
2088
2089/**
2090 * Allocates a new Chunk ID.
2091 *
2092 * @returns The Chunk ID.
2093 * @param pGMM Pointer to the GMM instance.
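 *
 * A caller is expected to treat NIL_GMM_CHUNKID as failure and to release the
 * ID again with gmmR0FreeChunkId, e.g. (mirroring the checks made in
 * gmmR0RegisterChunk below):
 * @code
 *      uint32_t idChunk = gmmR0AllocateChunkId(pGMM);
 *      if (idChunk != NIL_GMM_CHUNKID && idChunk <= GMM_CHUNKID_LAST)
 *      {
 *          // ... use the ID; eventually gmmR0FreeChunkId(pGMM, idChunk) ...
 *      }
 * @endcode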
2094 */
2095static uint32_t gmmR0AllocateChunkId(PGMM pGMM)
2096{
2097 AssertCompile(!((GMM_CHUNKID_LAST + 1) & 31)); /* must be a multiple of 32 */
2098 AssertCompile(NIL_GMM_CHUNKID == 0);
2099
2100 /*
2101 * Try the next sequential one.
2102 */
2103 int32_t idChunk = ++pGMM->idChunkPrev;
2104#if 0 /** @todo enable this code */
2105 if ( idChunk <= GMM_CHUNKID_LAST
2106 && idChunk > NIL_GMM_CHUNKID
2107 && !ASMAtomicBitTestAndSet(&pGMM->bmChunkId[0], idChunk))
2108 return idChunk;
2109#endif
2110
2111 /*
2112 * Scan sequentially from the last one.
2113 */
2114 if ( (uint32_t)idChunk < GMM_CHUNKID_LAST
2115 && idChunk > NIL_GMM_CHUNKID)
2116 {
2117 idChunk = ASMBitNextClear(&pGMM->bmChunkId[0], GMM_CHUNKID_LAST + 1, idChunk - 1);
2118 if (idChunk > NIL_GMM_CHUNKID)
2119 {
2120 AssertMsgReturn(!ASMAtomicBitTestAndSet(&pGMM->bmChunkId[0], idChunk), ("%#x\n", idChunk), NIL_GMM_CHUNKID);
2121 return pGMM->idChunkPrev = idChunk;
2122 }
2123 }
2124
2125 /*
2126 * Ok, scan from the start.
2127 * We're not racing anyone, so there is no need to expect failures or have restart loops.
2128 */
2129 idChunk = ASMBitFirstClear(&pGMM->bmChunkId[0], GMM_CHUNKID_LAST + 1);
2130 AssertMsgReturn(idChunk > NIL_GMM_CHUNKID, ("%#x\n", idChunk), NIL_GMM_CHUNKID);
2131 AssertMsgReturn(!ASMAtomicBitTestAndSet(&pGMM->bmChunkId[0], idChunk), ("%#x\n", idChunk), NIL_GMM_CHUNKID);
2132
2133 return pGMM->idChunkPrev = idChunk;
2134}
2135
2136
2137/**
2138 * Allocates one private page.
2139 *
2140 * Worker for gmmR0AllocatePages.
2141 *
2142 * @param pChunk The chunk to allocate it from.
2143 * @param hGVM The GVM handle of the VM requesting memory.
2144 * @param pPageDesc The page descriptor.
2145 */
2146static void gmmR0AllocatePage(PGMMCHUNK pChunk, uint32_t hGVM, PGMMPAGEDESC pPageDesc)
2147{
2148 /* update the chunk stats. */
2149 if (pChunk->hGVM == NIL_GVM_HANDLE)
2150 pChunk->hGVM = hGVM;
2151 Assert(pChunk->cFree);
2152 pChunk->cFree--;
2153 pChunk->cPrivate++;
2154
2155 /* unlink the first free page. */
2156 const uint32_t iPage = pChunk->iFreeHead;
2157 AssertReleaseMsg(iPage < RT_ELEMENTS(pChunk->aPages), ("%d\n", iPage));
2158 PGMMPAGE pPage = &pChunk->aPages[iPage];
2159 Assert(GMM_PAGE_IS_FREE(pPage));
2160 pChunk->iFreeHead = pPage->Free.iNext;
2161 Log3(("A pPage=%p iPage=%#x/%#x u2State=%d iFreeHead=%#x iNext=%#x\n",
2162 pPage, iPage, (pChunk->Core.Key << GMM_CHUNKID_SHIFT) | iPage,
2163 pPage->Common.u2State, pChunk->iFreeHead, pPage->Free.iNext));
2164
2165 /* make the page private. */
2166 pPage->u = 0;
2167 AssertCompile(GMM_PAGE_STATE_PRIVATE == 0);
2168 pPage->Private.hGVM = hGVM;
2169 AssertCompile(NIL_RTHCPHYS >= GMM_GCPHYS_LAST);
2170 AssertCompile(GMM_GCPHYS_UNSHAREABLE >= GMM_GCPHYS_LAST);
2171 if (pPageDesc->HCPhysGCPhys <= GMM_GCPHYS_LAST)
2172 pPage->Private.pfn = pPageDesc->HCPhysGCPhys >> PAGE_SHIFT;
2173 else
2174 pPage->Private.pfn = GMM_PAGE_PFN_UNSHAREABLE; /* unshareable / unassigned - same thing. */
2175
2176 /* update the page descriptor. */
2177 pPageDesc->HCPhysGCPhys = RTR0MemObjGetPagePhysAddr(pChunk->hMemObj, iPage);
2178 Assert(pPageDesc->HCPhysGCPhys != NIL_RTHCPHYS);
2179 pPageDesc->idPage = (pChunk->Core.Key << GMM_CHUNKID_SHIFT) | iPage;
2180 pPageDesc->idSharedPage = NIL_GMM_PAGEID;
2181}
2182
2183
2184/**
2185 * Picks the free pages from a chunk.
2186 *
2187 * @returns The new page descriptor table index.
2188 * @param pChunk The chunk.
2189 * @param hGVM The affinity of the chunk. NIL_GVM_HANDLE for no
2190 * affinity.
2191 * @param iPage The current page descriptor table index.
2192 * @param cPages The total number of pages to allocate.
2193 * @param paPages The page descriptor table (input + output).
2194 */
2195static uint32_t gmmR0AllocatePagesFromChunk(PGMMCHUNK pChunk, uint16_t const hGVM, uint32_t iPage, uint32_t cPages,
2196 PGMMPAGEDESC paPages)
2197{
2198 PGMMCHUNKFREESET pSet = pChunk->pSet; Assert(pSet);
2199 gmmR0UnlinkChunk(pChunk);
2200
2201 for (; pChunk->cFree && iPage < cPages; iPage++)
2202 gmmR0AllocatePage(pChunk, hGVM, &paPages[iPage]);
2203
2204 gmmR0LinkChunk(pChunk, pSet);
2205 return iPage;
2206}
2207
2208
2209/**
2210 * Registers a new chunk of memory.
2211 *
2212 * This is called by gmmR0AllocateChunkNew, GMMR0AllocateLargePage and GMMR0SeedChunk.
2213 *
2214 * @returns VBox status code. On success, the giant GMM lock will be held, the
2215 * caller must release it (ugly).
2216 * @param pGMM Pointer to the GMM instance.
2217 * @param pSet Pointer to the set.
2218 * @param hMemObj The memory object for the chunk.
2219 * @param hGVM The affinity of the chunk. NIL_GVM_HANDLE for no
2220 * affinity.
2221 * @param fChunkFlags The chunk flags, GMM_CHUNK_FLAGS_XXX.
2222 * @param ppChunk Chunk address (out). Optional.
2223 *
2224 * @remarks The caller must not own the giant GMM mutex.
2225 * The giant GMM mutex will be acquired and returned acquired in
2226 * the success path. On failure, no locks will be held.
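 *
 * A sketch of the expected calling pattern (mirroring gmmR0AllocateChunkNew
 * below; error handling trimmed):
 * @code
 *      // the giant GMM mutex must not be held here
 *      PGMMCHUNK pChunk;
 *      int rc = gmmR0RegisterChunk(pGMM, pSet, hMemObj, hGVM, 0, &pChunk);   // fChunkFlags = 0
 *      if (RT_SUCCESS(rc))
 *      {
 *          // the giant mutex is held again at this point; use pChunk, then eventually:
 *          gmmR0MutexRelease(pGMM);
 *      }
 *      else
 *          RTR0MemObjFree(hMemObj, true);   // no GMM locks are held on failure
 * @endcode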
2227 */
2228static int gmmR0RegisterChunk(PGMM pGMM, PGMMCHUNKFREESET pSet, RTR0MEMOBJ hMemObj, uint16_t hGVM, uint16_t fChunkFlags,
2229 PGMMCHUNK *ppChunk)
2230{
2231 Assert(pGMM->hMtxOwner != RTThreadNativeSelf());
2232 Assert(hGVM != NIL_GVM_HANDLE || pGMM->fBoundMemoryMode);
2233#ifdef GMM_WITH_LEGACY_MODE
2234 Assert(fChunkFlags == 0 || fChunkFlags == GMM_CHUNK_FLAGS_LARGE_PAGE || fChunkFlags == GMM_CHUNK_FLAGS_SEEDED);
2235#else
2236 Assert(fChunkFlags == 0 || fChunkFlags == GMM_CHUNK_FLAGS_LARGE_PAGE);
2237#endif
2238
2239#ifndef VBOX_WITH_LINEAR_HOST_PHYS_MEM
2240 /*
2241 * Get a ring-0 mapping of the object.
2242 */
2243# ifdef GMM_WITH_LEGACY_MODE
2244 uint8_t *pbMapping = !(fChunkFlags & GMM_CHUNK_FLAGS_SEEDED) ? (uint8_t *)RTR0MemObjAddress(hMemObj) : NULL;
2245# else
2246 uint8_t *pbMapping = (uint8_t *)RTR0MemObjAddress(hMemObj);
2247# endif
2248 if (!pbMapping)
2249 {
2250 RTR0MEMOBJ hMapObj;
2251 int rc = RTR0MemObjMapKernel(&hMapObj, hMemObj, (void *)-1, 0, RTMEM_PROT_READ | RTMEM_PROT_WRITE);
2252 if (RT_SUCCESS(rc))
2253 pbMapping = (uint8_t *)RTR0MemObjAddress(hMapObj);
2254 else
2255 return rc;
2256 AssertPtr(pbMapping);
2257 }
2258#endif
2259
2260 /*
2261 * Allocate a chunk.
2262 */
2263 int rc;
2264 PGMMCHUNK pChunk = (PGMMCHUNK)RTMemAllocZ(sizeof(*pChunk));
2265 if (pChunk)
2266 {
2267 /*
2268 * Initialize it.
2269 */
2270 pChunk->hMemObj = hMemObj;
2271#ifndef VBOX_WITH_LINEAR_HOST_PHYS_MEM
2272 pChunk->pbMapping = pbMapping;
2273#endif
2274 pChunk->cFree = GMM_CHUNK_NUM_PAGES;
2275 pChunk->hGVM = hGVM;
2276 /*pChunk->iFreeHead = 0;*/
2277 pChunk->idNumaNode = gmmR0GetCurrentNumaNodeId();
2278 pChunk->iChunkMtx = UINT8_MAX;
2279 pChunk->fFlags = fChunkFlags;
2280 for (unsigned iPage = 0; iPage < RT_ELEMENTS(pChunk->aPages) - 1; iPage++)
2281 {
2282 pChunk->aPages[iPage].Free.u2State = GMM_PAGE_STATE_FREE;
2283 pChunk->aPages[iPage].Free.iNext = iPage + 1;
2284 }
2285 pChunk->aPages[RT_ELEMENTS(pChunk->aPages) - 1].Free.u2State = GMM_PAGE_STATE_FREE;
2286 pChunk->aPages[RT_ELEMENTS(pChunk->aPages) - 1].Free.iNext = UINT16_MAX;
2287
2288 /*
2289 * Allocate a Chunk ID and insert it into the tree.
2290 * This has to be done behind the mutex of course.
2291 */
2292 rc = gmmR0MutexAcquire(pGMM);
2293 if (RT_SUCCESS(rc))
2294 {
2295 if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
2296 {
2297 pChunk->Core.Key = gmmR0AllocateChunkId(pGMM);
2298 if ( pChunk->Core.Key != NIL_GMM_CHUNKID
2299 && pChunk->Core.Key <= GMM_CHUNKID_LAST)
2300 {
2301 RTSpinlockAcquire(pGMM->hSpinLockTree);
2302 if (RTAvlU32Insert(&pGMM->pChunks, &pChunk->Core))
2303 {
2304 pGMM->cChunks++;
2305 RTListAppend(&pGMM->ChunkList, &pChunk->ListNode);
2306 RTSpinlockRelease(pGMM->hSpinLockTree);
2307
2308 gmmR0LinkChunk(pChunk, pSet);
2309
2310 LogFlow(("gmmR0RegisterChunk: pChunk=%p id=%#x cChunks=%d\n", pChunk, pChunk->Core.Key, pGMM->cChunks));
2311
2312 if (ppChunk)
2313 *ppChunk = pChunk;
2314 GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
2315 return VINF_SUCCESS;
2316 }
2317 RTSpinlockRelease(pGMM->hSpinLockTree);
2318 }
2319
2320 /* bail out */
2321 rc = VERR_GMM_CHUNK_INSERT;
2322 }
2323 else
2324 rc = VERR_GMM_IS_NOT_SANE;
2325 gmmR0MutexRelease(pGMM);
2326 }
2327
2328 RTMemFree(pChunk);
2329 }
2330 else
2331 rc = VERR_NO_MEMORY;
2332 return rc;
2333}
2334
2335
2336/**
2337 * Allocates a new chunk, immediately picks the requested pages from it, and adds
2338 * what's remaining to the specified free set.
2339 *
2340 * @note This will leave the giant mutex while allocating the new chunk!
2341 *
2342 * @returns VBox status code.
2343 * @param pGMM Pointer to the GMM instance data.
2344 * @param pGVM Pointer to the kernel-only VM instance data.
2345 * @param pSet Pointer to the free set.
2346 * @param cPages The number of pages requested.
2347 * @param paPages The page descriptor table (input + output).
2348 * @param piPage The pointer to the page descriptor table index variable.
2349 * This will be updated.
2350 */
2351static int gmmR0AllocateChunkNew(PGMM pGMM, PGVM pGVM, PGMMCHUNKFREESET pSet, uint32_t cPages,
2352 PGMMPAGEDESC paPages, uint32_t *piPage)
2353{
2354 gmmR0MutexRelease(pGMM);
2355
2356 RTR0MEMOBJ hMemObj;
2357#ifndef GMM_WITH_LEGACY_MODE
2358 int rc;
2359# ifdef VBOX_WITH_LINEAR_HOST_PHYS_MEM
2360 if (pGMM->fHasWorkingAllocPhysNC)
2361 rc = RTR0MemObjAllocPhysNC(&hMemObj, GMM_CHUNK_SIZE, NIL_RTHCPHYS);
2362 else
2363# endif
2364 rc = RTR0MemObjAllocPage(&hMemObj, GMM_CHUNK_SIZE, false /*fExecutable*/);
2365#else
2366 int rc = RTR0MemObjAllocPhysNC(&hMemObj, GMM_CHUNK_SIZE, NIL_RTHCPHYS);
2367#endif
2368 if (RT_SUCCESS(rc))
2369 {
2370 /** @todo Duplicate gmmR0RegisterChunk here so we can avoid chaining up the
2371 * free pages first and then unchaining them right afterwards. Instead
2372 * do as much work as possible without holding the giant lock. */
2373 PGMMCHUNK pChunk;
2374 rc = gmmR0RegisterChunk(pGMM, pSet, hMemObj, pGVM->hSelf, 0 /*fChunkFlags*/, &pChunk);
2375 if (RT_SUCCESS(rc))
2376 {
2377 *piPage = gmmR0AllocatePagesFromChunk(pChunk, pGVM->hSelf, *piPage, cPages, paPages);
2378 return VINF_SUCCESS;
2379 }
2380
2381 /* bail out */
2382 RTR0MemObjFree(hMemObj, true /* fFreeMappings */);
2383 }
2384
2385 int rc2 = gmmR0MutexAcquire(pGMM);
2386 AssertRCReturn(rc2, RT_FAILURE(rc) ? rc : rc2);
2387 return rc;
2388
2389}
2390
2391
2392/**
2393 * As a last resort we'll pick any page we can get.
2394 *
2395 * @returns The new page descriptor table index.
2396 * @param pSet The set to pick from.
2397 * @param pGVM Pointer to the global VM structure.
2398 * @param iPage The current page descriptor table index.
2399 * @param cPages The total number of pages to allocate.
2400 * @param paPages The page descriptor table (input + output).
2401 */
2402static uint32_t gmmR0AllocatePagesIndiscriminately(PGMMCHUNKFREESET pSet, PGVM pGVM,
2403 uint32_t iPage, uint32_t cPages, PGMMPAGEDESC paPages)
2404{
2405 unsigned iList = RT_ELEMENTS(pSet->apLists);
2406 while (iList-- > 0)
2407 {
2408 PGMMCHUNK pChunk = pSet->apLists[iList];
2409 while (pChunk)
2410 {
2411 PGMMCHUNK pNext = pChunk->pFreeNext;
2412
2413 iPage = gmmR0AllocatePagesFromChunk(pChunk, pGVM->hSelf, iPage, cPages, paPages);
2414 if (iPage >= cPages)
2415 return iPage;
2416
2417 pChunk = pNext;
2418 }
2419 }
2420 return iPage;
2421}
2422
2423
2424/**
2425 * Pick pages from empty chunks on the same NUMA node.
2426 *
2427 * @returns The new page descriptor table index.
2428 * @param pSet The set to pick from.
2429 * @param pGVM Pointer to the global VM structure.
2430 * @param iPage The current page descriptor table index.
2431 * @param cPages The total number of pages to allocate.
2432 * @param paPages The page descriptor table (input + output).
2433 */
2434static uint32_t gmmR0AllocatePagesFromEmptyChunksOnSameNode(PGMMCHUNKFREESET pSet, PGVM pGVM,
2435 uint32_t iPage, uint32_t cPages, PGMMPAGEDESC paPages)
2436{
2437 PGMMCHUNK pChunk = pSet->apLists[GMM_CHUNK_FREE_SET_UNUSED_LIST];
2438 if (pChunk)
2439 {
2440 uint16_t const idNumaNode = gmmR0GetCurrentNumaNodeId();
2441 while (pChunk)
2442 {
2443 PGMMCHUNK pNext = pChunk->pFreeNext;
2444
2445 if (pChunk->idNumaNode == idNumaNode)
2446 {
2447 pChunk->hGVM = pGVM->hSelf;
2448 iPage = gmmR0AllocatePagesFromChunk(pChunk, pGVM->hSelf, iPage, cPages, paPages);
2449 if (iPage >= cPages)
2450 {
2451 pGVM->gmm.s.idLastChunkHint = pChunk->cFree ? pChunk->Core.Key : NIL_GMM_CHUNKID;
2452 return iPage;
2453 }
2454 }
2455
2456 pChunk = pNext;
2457 }
2458 }
2459 return iPage;
2460}
2461
2462
2463/**
2464 * Pick pages from non-empty chunks on the same NUMA node.
2465 *
2466 * @returns The new page descriptor table index.
2467 * @param pSet The set to pick from.
2468 * @param pGVM Pointer to the global VM structure.
2469 * @param iPage The current page descriptor table index.
2470 * @param cPages The total number of pages to allocate.
2471 * @param paPages The page descriptor table (input + output).
2472 */
2473static uint32_t gmmR0AllocatePagesFromSameNode(PGMMCHUNKFREESET pSet, PGVM pGVM,
2474 uint32_t iPage, uint32_t cPages, PGMMPAGEDESC paPages)
2475{
2476 /** @todo start by picking from chunks with about the right size first? */
2477 uint16_t const idNumaNode = gmmR0GetCurrentNumaNodeId();
2478 unsigned iList = GMM_CHUNK_FREE_SET_UNUSED_LIST;
2479 while (iList-- > 0)
2480 {
2481 PGMMCHUNK pChunk = pSet->apLists[iList];
2482 while (pChunk)
2483 {
2484 PGMMCHUNK pNext = pChunk->pFreeNext;
2485
2486 if (pChunk->idNumaNode == idNumaNode)
2487 {
2488 iPage = gmmR0AllocatePagesFromChunk(pChunk, pGVM->hSelf, iPage, cPages, paPages);
2489 if (iPage >= cPages)
2490 {
2491 pGVM->gmm.s.idLastChunkHint = pChunk->cFree ? pChunk->Core.Key : NIL_GMM_CHUNKID;
2492 return iPage;
2493 }
2494 }
2495
2496 pChunk = pNext;
2497 }
2498 }
2499 return iPage;
2500}
2501
2502
2503/**
2504 * Pick pages that are in chunks already associated with the VM.
2505 *
2506 * @returns The new page descriptor table index.
2507 * @param pGMM Pointer to the GMM instance data.
2508 * @param pGVM Pointer to the global VM structure.
2509 * @param pSet The set to pick from.
2510 * @param iPage The current page descriptor table index.
2511 * @param cPages The total number of pages to allocate.
2512 * @param paPages The page descriptor table (input + output).
2513 */
2514static uint32_t gmmR0AllocatePagesAssociatedWithVM(PGMM pGMM, PGVM pGVM, PGMMCHUNKFREESET pSet,
2515 uint32_t iPage, uint32_t cPages, PGMMPAGEDESC paPages)
2516{
2517 uint16_t const hGVM = pGVM->hSelf;
2518
2519 /* Hint. */
2520 if (pGVM->gmm.s.idLastChunkHint != NIL_GMM_CHUNKID)
2521 {
2522 PGMMCHUNK pChunk = gmmR0GetChunk(pGMM, pGVM->gmm.s.idLastChunkHint);
2523 if (pChunk && pChunk->cFree)
2524 {
2525 iPage = gmmR0AllocatePagesFromChunk(pChunk, hGVM, iPage, cPages, paPages);
2526 if (iPage >= cPages)
2527 return iPage;
2528 }
2529 }
2530
2531 /* Scan. */
2532 for (unsigned iList = 0; iList < RT_ELEMENTS(pSet->apLists); iList++)
2533 {
2534 PGMMCHUNK pChunk = pSet->apLists[iList];
2535 while (pChunk)
2536 {
2537 PGMMCHUNK pNext = pChunk->pFreeNext;
2538
2539 if (pChunk->hGVM == hGVM)
2540 {
2541 iPage = gmmR0AllocatePagesFromChunk(pChunk, hGVM, iPage, cPages, paPages);
2542 if (iPage >= cPages)
2543 {
2544 pGVM->gmm.s.idLastChunkHint = pChunk->cFree ? pChunk->Core.Key : NIL_GMM_CHUNKID;
2545 return iPage;
2546 }
2547 }
2548
2549 pChunk = pNext;
2550 }
2551 }
2552 return iPage;
2553}
2554
2555
2556
2557/**
2558 * Pick pages in bound memory mode.
2559 *
2560 * @returns The new page descriptor table index.
2561 * @param pGVM Pointer to the global VM structure.
2562 * @param iPage The current page descriptor table index.
2563 * @param cPages The total number of pages to allocate.
2564 * @param paPages The page descriptor table (input + output).
2565 */
2566static uint32_t gmmR0AllocatePagesInBoundMode(PGVM pGVM, uint32_t iPage, uint32_t cPages, PGMMPAGEDESC paPages)
2567{
2568 for (unsigned iList = 0; iList < RT_ELEMENTS(pGVM->gmm.s.Private.apLists); iList++)
2569 {
2570 PGMMCHUNK pChunk = pGVM->gmm.s.Private.apLists[iList];
2571 while (pChunk)
2572 {
2573 Assert(pChunk->hGVM == pGVM->hSelf);
2574 PGMMCHUNK pNext = pChunk->pFreeNext;
2575 iPage = gmmR0AllocatePagesFromChunk(pChunk, pGVM->hSelf, iPage, cPages, paPages);
2576 if (iPage >= cPages)
2577 return iPage;
2578 pChunk = pNext;
2579 }
2580 }
2581 return iPage;
2582}
2583
2584
2585/**
2586 * Checks if we should start picking pages from chunks of other VMs because
2587 * we're getting close to the system memory or reserved limit.
2588 *
2589 * @returns @c true if we should, @c false if we should first try to allocate more
2590 * chunks.
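 *
 * As a worked example: with 4 KiB pages (an assumption here) a 2 MB chunk holds
 * 512 pages, so the "4 chunks" threshold in the code below means we start
 * borrowing from other VMs' chunks once less than 2048 pages (8 MB) of the
 * reservation remain unallocated.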
2591 */
2592static bool gmmR0ShouldAllocatePagesInOtherChunksBecauseOfLimits(PGVM pGVM)
2593{
2594 /*
2595 * Don't allocate a new chunk if we're getting close to the reservation limit.
2596 */
2597 uint64_t cPgReserved = pGVM->gmm.s.Stats.Reserved.cBasePages
2598 + pGVM->gmm.s.Stats.Reserved.cFixedPages
2599 - pGVM->gmm.s.Stats.cBalloonedPages
2600 /** @todo what about shared pages? */;
2601 uint64_t cPgAllocated = pGVM->gmm.s.Stats.Allocated.cBasePages
2602 + pGVM->gmm.s.Stats.Allocated.cFixedPages;
2603 uint64_t cPgDelta = cPgReserved - cPgAllocated;
2604 if (cPgDelta < GMM_CHUNK_NUM_PAGES * 4)
2605 return true;
2606 /** @todo make the threshold configurable, also test the code to see if
2607 * this ever kicks in (we might be reserving too much or something). */
2608
2609 /*
2610 * Check how close we are to the max memory limit and how many fragments
2611 * there are...
2612 */
2613 /** @todo */
2614
2615 return false;
2616}
2617
2618
2619/**
2620 * Checks if we should start picking pages from chunks of other VMs because
2621 * there is a lot of free pages around.
2622 *
2623 * @returns @c true if we should, @c false if we should first try to allocate more
2624 * chunks.
2625 */
2626static bool gmmR0ShouldAllocatePagesInOtherChunksBecauseOfLotsFree(PGMM pGMM)
2627{
2628 /*
2629 * Setting the limit at 16 chunks (32 MB) at the moment.
2630 */
2631 if (pGMM->PrivateX.cFreePages >= GMM_CHUNK_NUM_PAGES * 16)
2632 return true;
2633 return false;
2634}
2635
2636
2637/**
2638 * Common worker for GMMR0AllocateHandyPages and GMMR0AllocatePages.
2639 *
2640 * @returns VBox status code:
2641 * @retval VINF_SUCCESS on success.
2642 * @retval VERR_GMM_SEED_ME if seeding via GMMR0SeedChunk or
2643 * gmmR0AllocateChunkNew is necessary.
2644 * @retval VERR_GMM_HIT_GLOBAL_LIMIT if we've exhausted the available pages.
2645 * @retval VERR_GMM_HIT_VM_ACCOUNT_LIMIT if we've hit the VM account limit,
2646 * that is we're trying to allocate more than we've reserved.
2647 *
2648 * @param pGMM Pointer to the GMM instance data.
2649 * @param pGVM Pointer to the VM.
2650 * @param cPages The number of pages to allocate.
2651 * @param paPages Pointer to the page descriptors. See GMMPAGEDESC for
2652 * details on what is expected on input.
2653 * @param enmAccount The account to charge.
2654 *
2655 * @remarks Caller must own the giant GMM lock.
2656 */
2657static int gmmR0AllocatePagesNew(PGMM pGMM, PGVM pGVM, uint32_t cPages, PGMMPAGEDESC paPages, GMMACCOUNT enmAccount)
2658{
2659 Assert(pGMM->hMtxOwner == RTThreadNativeSelf());
2660
2661 /*
2662 * Check allocation limits.
2663 */
2664 if (RT_LIKELY(pGMM->cAllocatedPages + cPages <= pGMM->cMaxPages))
2665 { /* likely */ }
2666 else
2667 return VERR_GMM_HIT_GLOBAL_LIMIT;
2668
2669 switch (enmAccount)
2670 {
2671 case GMMACCOUNT_BASE:
2672 if (RT_LIKELY( pGVM->gmm.s.Stats.Allocated.cBasePages + pGVM->gmm.s.Stats.cBalloonedPages + cPages
2673 <= pGVM->gmm.s.Stats.Reserved.cBasePages))
2674 { /* likely */ }
2675 else
2676 {
2677 Log(("gmmR0AllocatePages:Base: Reserved=%#llx Allocated+Ballooned+Requested=%#llx+%#llx+%#x!\n",
2678 pGVM->gmm.s.Stats.Reserved.cBasePages, pGVM->gmm.s.Stats.Allocated.cBasePages,
2679 pGVM->gmm.s.Stats.cBalloonedPages, cPages));
2680 return VERR_GMM_HIT_VM_ACCOUNT_LIMIT;
2681 }
2682 break;
2683 case GMMACCOUNT_SHADOW:
2684 if (RT_LIKELY(pGVM->gmm.s.Stats.Allocated.cShadowPages + cPages <= pGVM->gmm.s.Stats.Reserved.cShadowPages))
2685 { /* likely */ }
2686 else
2687 {
2688 Log(("gmmR0AllocatePages:Shadow: Reserved=%#x Allocated+Requested=%#x+%#x!\n",
2689 pGVM->gmm.s.Stats.Reserved.cShadowPages, pGVM->gmm.s.Stats.Allocated.cShadowPages, cPages));
2690 return VERR_GMM_HIT_VM_ACCOUNT_LIMIT;
2691 }
2692 break;
2693 case GMMACCOUNT_FIXED:
2694 if (RT_LIKELY(pGVM->gmm.s.Stats.Allocated.cFixedPages + cPages <= pGVM->gmm.s.Stats.Reserved.cFixedPages))
2695 { /* likely */ }
2696 else
2697 {
2698 Log(("gmmR0AllocatePages:Fixed: Reserved=%#x Allocated+Requested=%#x+%#x!\n",
2699 pGVM->gmm.s.Stats.Reserved.cFixedPages, pGVM->gmm.s.Stats.Allocated.cFixedPages, cPages));
2700 return VERR_GMM_HIT_VM_ACCOUNT_LIMIT;
2701 }
2702 break;
2703 default:
2704 AssertMsgFailedReturn(("enmAccount=%d\n", enmAccount), VERR_IPE_NOT_REACHED_DEFAULT_CASE);
2705 }
2706
2707#ifdef GMM_WITH_LEGACY_MODE
2708 /*
2709 * If we're in legacy memory mode, it's easy to figure if we have
2710 * sufficient number of pages up-front.
2711 */
2712 if ( pGMM->fLegacyAllocationMode
2713 && pGVM->gmm.s.Private.cFreePages < cPages)
2714 {
2715 Assert(pGMM->fBoundMemoryMode);
2716 return VERR_GMM_SEED_ME;
2717 }
2718#endif
2719
2720 /*
2721 * Update the accounts before we proceed because we might be leaving the
2722 * protection of the global mutex and thus run the risk of permitting
2723 * too much memory to be allocated.
2724 */
2725 switch (enmAccount)
2726 {
2727 case GMMACCOUNT_BASE: pGVM->gmm.s.Stats.Allocated.cBasePages += cPages; break;
2728 case GMMACCOUNT_SHADOW: pGVM->gmm.s.Stats.Allocated.cShadowPages += cPages; break;
2729 case GMMACCOUNT_FIXED: pGVM->gmm.s.Stats.Allocated.cFixedPages += cPages; break;
2730 default: AssertMsgFailedReturn(("enmAccount=%d\n", enmAccount), VERR_IPE_NOT_REACHED_DEFAULT_CASE);
2731 }
2732 pGVM->gmm.s.Stats.cPrivatePages += cPages;
2733 pGMM->cAllocatedPages += cPages;
2734
2735#ifdef GMM_WITH_LEGACY_MODE
2736 /*
2737 * Part two of it's-easy-in-legacy-memory-mode.
2738 */
2739 if (pGMM->fLegacyAllocationMode)
2740 {
2741 uint32_t iPage = gmmR0AllocatePagesInBoundMode(pGVM, 0, cPages, paPages);
2742 AssertReleaseReturn(iPage == cPages, VERR_GMM_ALLOC_PAGES_IPE);
2743 return VINF_SUCCESS;
2744 }
2745#endif
2746
2747 /*
2748 * Bound mode is also relatively straightforward.
2749 */
2750 uint32_t iPage = 0;
2751 int rc = VINF_SUCCESS;
2752 if (pGMM->fBoundMemoryMode)
2753 {
2754 iPage = gmmR0AllocatePagesInBoundMode(pGVM, iPage, cPages, paPages);
2755 if (iPage < cPages)
2756 do
2757 rc = gmmR0AllocateChunkNew(pGMM, pGVM, &pGVM->gmm.s.Private, cPages, paPages, &iPage);
2758 while (iPage < cPages && RT_SUCCESS(rc));
2759 }
2760 /*
2761 * Shared mode is trickier as we should try to achieve the same locality as
2762 * in bound mode, but smartly make use of non-full chunks allocated by
2763 * other VMs if we're low on memory.
2764 */
2765 else
2766 {
2767 /* Pick the most optimal pages first. */
2768 iPage = gmmR0AllocatePagesAssociatedWithVM(pGMM, pGVM, &pGMM->PrivateX, iPage, cPages, paPages);
2769 if (iPage < cPages)
2770 {
2771 /* Maybe we should try getting pages from chunks "belonging" to
2772 other VMs before allocating more chunks? */
2773 bool fTriedOnSameAlready = false;
2774 if (gmmR0ShouldAllocatePagesInOtherChunksBecauseOfLimits(pGVM))
2775 {
2776 iPage = gmmR0AllocatePagesFromSameNode(&pGMM->PrivateX, pGVM, iPage, cPages, paPages);
2777 fTriedOnSameAlready = true;
2778 }
2779
2780 /* Allocate memory from empty chunks. */
2781 if (iPage < cPages)
2782 iPage = gmmR0AllocatePagesFromEmptyChunksOnSameNode(&pGMM->PrivateX, pGVM, iPage, cPages, paPages);
2783
2784 /* Grab empty shared chunks. */
2785 if (iPage < cPages)
2786 iPage = gmmR0AllocatePagesFromEmptyChunksOnSameNode(&pGMM->Shared, pGVM, iPage, cPages, paPages);
2787
2788 /* If there are a lot of free pages spread around, try not to waste
2789 system memory on more chunks. (Should trigger defragmentation.) */
2790 if ( !fTriedOnSameAlready
2791 && gmmR0ShouldAllocatePagesInOtherChunksBecauseOfLotsFree(pGMM))
2792 {
2793 iPage = gmmR0AllocatePagesFromSameNode(&pGMM->PrivateX, pGVM, iPage, cPages, paPages);
2794 if (iPage < cPages)
2795 iPage = gmmR0AllocatePagesIndiscriminately(&pGMM->PrivateX, pGVM, iPage, cPages, paPages);
2796 }
2797
2798 /*
2799 * Ok, try allocate new chunks.
2800 */
2801 if (iPage < cPages)
2802 {
2803 do
2804 rc = gmmR0AllocateChunkNew(pGMM, pGVM, &pGMM->PrivateX, cPages, paPages, &iPage);
2805 while (iPage < cPages && RT_SUCCESS(rc));
2806
2807 /* If the host is out of memory, take whatever we can get. */
2808 if ( (rc == VERR_NO_MEMORY || rc == VERR_NO_PHYS_MEMORY)
2809 && pGMM->PrivateX.cFreePages + pGMM->Shared.cFreePages >= cPages - iPage)
2810 {
2811 iPage = gmmR0AllocatePagesIndiscriminately(&pGMM->PrivateX, pGVM, iPage, cPages, paPages);
2812 if (iPage < cPages)
2813 iPage = gmmR0AllocatePagesIndiscriminately(&pGMM->Shared, pGVM, iPage, cPages, paPages);
2814 AssertRelease(iPage == cPages);
2815 rc = VINF_SUCCESS;
2816 }
2817 }
2818 }
2819 }
2820
2821 /*
2822 * Clean up on failure. Since this is bound to be a low-memory condition
2823 * we will give back any empty chunks that might be hanging around.
2824 */
2825 if (RT_SUCCESS(rc))
2826 { /* likely */ }
2827 else
2828 {
2829 /* Update the statistics. */
2830 pGVM->gmm.s.Stats.cPrivatePages -= cPages;
2831 pGMM->cAllocatedPages -= cPages - iPage;
2832 switch (enmAccount)
2833 {
2834 case GMMACCOUNT_BASE: pGVM->gmm.s.Stats.Allocated.cBasePages -= cPages; break;
2835 case GMMACCOUNT_SHADOW: pGVM->gmm.s.Stats.Allocated.cShadowPages -= cPages; break;
2836 case GMMACCOUNT_FIXED: pGVM->gmm.s.Stats.Allocated.cFixedPages -= cPages; break;
2837 default: AssertMsgFailedReturn(("enmAccount=%d\n", enmAccount), VERR_IPE_NOT_REACHED_DEFAULT_CASE);
2838 }
2839
2840 /* Release the pages. */
2841 while (iPage-- > 0)
2842 {
2843 uint32_t idPage = paPages[iPage].idPage;
2844 PGMMPAGE pPage = gmmR0GetPage(pGMM, idPage);
2845 if (RT_LIKELY(pPage))
2846 {
2847 Assert(GMM_PAGE_IS_PRIVATE(pPage));
2848 Assert(pPage->Private.hGVM == pGVM->hSelf);
2849 gmmR0FreePrivatePage(pGMM, pGVM, idPage, pPage);
2850 }
2851 else
2852 AssertMsgFailed(("idPage=%#x\n", idPage));
2853
2854 paPages[iPage].idPage = NIL_GMM_PAGEID;
2855 paPages[iPage].idSharedPage = NIL_GMM_PAGEID;
2856 paPages[iPage].HCPhysGCPhys = NIL_RTHCPHYS;
2857 }
2858
2859 /* Free empty chunks. */
2860 /** @todo */
2861
2862 /* return the fail status on failure */
2863 return rc;
2864 }
2865 return VINF_SUCCESS;
2866}
2867
2868
2869/**
2870 * Updates the previous allocations and allocates more pages.
2871 *
2872 * The handy pages are always taken from the 'base' memory account.
2873 * The allocated pages are not cleared and will contain random garbage.
2874 *
2875 * @returns VBox status code:
2876 * @retval VINF_SUCCESS on success.
2877 * @retval VERR_NOT_OWNER if the caller is not an EMT.
2878 * @retval VERR_GMM_PAGE_NOT_FOUND if one of the pages to update wasn't found.
2879 * @retval VERR_GMM_PAGE_NOT_PRIVATE if one of the pages to update wasn't a
2880 * private page.
2881 * @retval VERR_GMM_PAGE_NOT_SHARED if one of the pages to update wasn't a
2882 * shared page.
2883 * @retval VERR_GMM_NOT_PAGE_OWNER if one of the pages to be updated wasn't
2884 * owned by the VM.
2885 * @retval VERR_GMM_SEED_ME if seeding via GMMR0SeedChunk is necessary.
2886 * @retval VERR_GMM_HIT_GLOBAL_LIMIT if we've exhausted the available pages.
2887 * @retval VERR_GMM_HIT_VM_ACCOUNT_LIMIT if we've hit the VM account limit,
2888 * that is we're trying to allocate more than we've reserved.
2889 *
2890 * @param pGVM The global (ring-0) VM structure.
2891 * @param idCpu The VCPU id.
2892 * @param cPagesToUpdate The number of pages to update (starting from the head).
2893 * @param cPagesToAlloc The number of pages to allocate (starting from the head).
2894 * @param paPages The array of page descriptors.
2895 * See GMMPAGEDESC for details on what is expected on input.
2896 * @thread EMT(idCpu)
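 *
 * A rough sketch of the descriptor contents expected on input (the validation
 * code below is authoritative; the variable names here are illustrative only):
 * @code
 *      // entry updating an existing private page and/or releasing a shared one:
 *      paPages[i].idPage       = idPrivatePage;   // page owned by this VM
 *      paPages[i].HCPhysGCPhys = GCPhysNew;       // new guest address, GMM_GCPHYS_UNSHAREABLE or NIL_RTHCPHYS
 *      paPages[i].idSharedPage = idSharedPage;    // or NIL_GMM_PAGEID if nothing to release
 *      // entry that only requests a new page (index beyond cPagesToUpdate):
 *      paPages[j].idPage       = NIL_GMM_PAGEID;
 *      paPages[j].idSharedPage = NIL_GMM_PAGEID;
 *      paPages[j].HCPhysGCPhys = NIL_RTHCPHYS;
 * @endcode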
2897 */
2898GMMR0DECL(int) GMMR0AllocateHandyPages(PGVM pGVM, VMCPUID idCpu, uint32_t cPagesToUpdate,
2899 uint32_t cPagesToAlloc, PGMMPAGEDESC paPages)
2900{
2901 LogFlow(("GMMR0AllocateHandyPages: pGVM=%p cPagesToUpdate=%#x cPagesToAlloc=%#x paPages=%p\n",
2902 pGVM, cPagesToUpdate, cPagesToAlloc, paPages));
2903
2904 /*
2905 * Validate, get basics and take the semaphore.
2906 * (This is a relatively busy path, so make predictions where possible.)
2907 */
2908 PGMM pGMM;
2909 GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
2910 int rc = GVMMR0ValidateGVMandEMT(pGVM, idCpu);
2911 if (RT_FAILURE(rc))
2912 return rc;
2913
2914 AssertPtrReturn(paPages, VERR_INVALID_PARAMETER);
2915 AssertMsgReturn( (cPagesToUpdate && cPagesToUpdate < 1024)
2916 || (cPagesToAlloc && cPagesToAlloc < 1024),
2917 ("cPagesToUpdate=%#x cPagesToAlloc=%#x\n", cPagesToUpdate, cPagesToAlloc),
2918 VERR_INVALID_PARAMETER);
2919
2920 unsigned iPage = 0;
2921 for (; iPage < cPagesToUpdate; iPage++)
2922 {
2923 AssertMsgReturn( ( paPages[iPage].HCPhysGCPhys <= GMM_GCPHYS_LAST
2924 && !(paPages[iPage].HCPhysGCPhys & PAGE_OFFSET_MASK))
2925 || paPages[iPage].HCPhysGCPhys == NIL_RTHCPHYS
2926 || paPages[iPage].HCPhysGCPhys == GMM_GCPHYS_UNSHAREABLE,
2927 ("#%#x: %RHp\n", iPage, paPages[iPage].HCPhysGCPhys),
2928 VERR_INVALID_PARAMETER);
2929 AssertMsgReturn( paPages[iPage].idPage <= GMM_PAGEID_LAST
2930 /*|| paPages[iPage].idPage == NIL_GMM_PAGEID*/,
2931 ("#%#x: %#x\n", iPage, paPages[iPage].idPage), VERR_INVALID_PARAMETER);
2932 AssertMsgReturn( paPages[iPage].idSharedPage == NIL_GMM_PAGEID
2933 || paPages[iPage].idSharedPage <= GMM_PAGEID_LAST,
2934 ("#%#x: %#x\n", iPage, paPages[iPage].idSharedPage), VERR_INVALID_PARAMETER);
2935 }
2936
2937 for (; iPage < cPagesToAlloc; iPage++)
2938 {
2939 AssertMsgReturn(paPages[iPage].HCPhysGCPhys == NIL_RTHCPHYS, ("#%#x: %RHp\n", iPage, paPages[iPage].HCPhysGCPhys), VERR_INVALID_PARAMETER);
2940 AssertMsgReturn(paPages[iPage].idPage == NIL_GMM_PAGEID, ("#%#x: %#x\n", iPage, paPages[iPage].idPage), VERR_INVALID_PARAMETER);
2941 AssertMsgReturn(paPages[iPage].idSharedPage == NIL_GMM_PAGEID, ("#%#x: %#x\n", iPage, paPages[iPage].idSharedPage), VERR_INVALID_PARAMETER);
2942 }
2943
2944 gmmR0MutexAcquire(pGMM);
2945 if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
2946 {
2947 /* No allocations before the initial reservation has been made! */
2948 if (RT_LIKELY( pGVM->gmm.s.Stats.Reserved.cBasePages
2949 && pGVM->gmm.s.Stats.Reserved.cFixedPages
2950 && pGVM->gmm.s.Stats.Reserved.cShadowPages))
2951 {
2952 /*
2953 * Perform the updates.
2954 * Stop on the first error.
2955 */
2956 for (iPage = 0; iPage < cPagesToUpdate; iPage++)
2957 {
2958 if (paPages[iPage].idPage != NIL_GMM_PAGEID)
2959 {
2960 PGMMPAGE pPage = gmmR0GetPage(pGMM, paPages[iPage].idPage);
2961 if (RT_LIKELY(pPage))
2962 {
2963 if (RT_LIKELY(GMM_PAGE_IS_PRIVATE(pPage)))
2964 {
2965 if (RT_LIKELY(pPage->Private.hGVM == pGVM->hSelf))
2966 {
2967 AssertCompile(NIL_RTHCPHYS > GMM_GCPHYS_LAST && GMM_GCPHYS_UNSHAREABLE > GMM_GCPHYS_LAST);
2968 if (RT_LIKELY(paPages[iPage].HCPhysGCPhys <= GMM_GCPHYS_LAST))
2969 pPage->Private.pfn = paPages[iPage].HCPhysGCPhys >> PAGE_SHIFT;
2970 else if (paPages[iPage].HCPhysGCPhys == GMM_GCPHYS_UNSHAREABLE)
2971 pPage->Private.pfn = GMM_PAGE_PFN_UNSHAREABLE;
2972 /* else: NIL_RTHCPHYS nothing */
2973
2974 paPages[iPage].idPage = NIL_GMM_PAGEID;
2975 paPages[iPage].HCPhysGCPhys = NIL_RTHCPHYS;
2976 }
2977 else
2978 {
2979 Log(("GMMR0AllocateHandyPages: #%#x/%#x: Not owner! hGVM=%#x hSelf=%#x\n",
2980 iPage, paPages[iPage].idPage, pPage->Private.hGVM, pGVM->hSelf));
2981 rc = VERR_GMM_NOT_PAGE_OWNER;
2982 break;
2983 }
2984 }
2985 else
2986 {
2987 Log(("GMMR0AllocateHandyPages: #%#x/%#x: Not private! %.*Rhxs (type %d)\n", iPage, paPages[iPage].idPage, sizeof(*pPage), pPage, pPage->Common.u2State));
2988 rc = VERR_GMM_PAGE_NOT_PRIVATE;
2989 break;
2990 }
2991 }
2992 else
2993 {
2994 Log(("GMMR0AllocateHandyPages: #%#x/%#x: Not found! (private)\n", iPage, paPages[iPage].idPage));
2995 rc = VERR_GMM_PAGE_NOT_FOUND;
2996 break;
2997 }
2998 }
2999
3000 if (paPages[iPage].idSharedPage == NIL_GMM_PAGEID)
3001 { /* likely */ }
3002 else
3003 {
3004 PGMMPAGE pPage = gmmR0GetPage(pGMM, paPages[iPage].idSharedPage);
3005 if (RT_LIKELY(pPage))
3006 {
3007 if (RT_LIKELY(GMM_PAGE_IS_SHARED(pPage)))
3008 {
3009 AssertCompile(NIL_RTHCPHYS > GMM_GCPHYS_LAST && GMM_GCPHYS_UNSHAREABLE > GMM_GCPHYS_LAST);
3010 Assert(pPage->Shared.cRefs);
3011 Assert(pGVM->gmm.s.Stats.cSharedPages);
3012 Assert(pGVM->gmm.s.Stats.Allocated.cBasePages);
3013
3014 Log(("GMMR0AllocateHandyPages: free shared page %x cRefs=%d\n", paPages[iPage].idSharedPage, pPage->Shared.cRefs));
3015 pGVM->gmm.s.Stats.cSharedPages--;
3016 pGVM->gmm.s.Stats.Allocated.cBasePages--;
3017 if (!--pPage->Shared.cRefs)
3018 gmmR0FreeSharedPage(pGMM, pGVM, paPages[iPage].idSharedPage, pPage);
3019 else
3020 {
3021 Assert(pGMM->cDuplicatePages);
3022 pGMM->cDuplicatePages--;
3023 }
3024
3025 paPages[iPage].idSharedPage = NIL_GMM_PAGEID;
3026 }
3027 else
3028 {
3029 Log(("GMMR0AllocateHandyPages: #%#x/%#x: Not shared!\n", iPage, paPages[iPage].idSharedPage));
3030 rc = VERR_GMM_PAGE_NOT_SHARED;
3031 break;
3032 }
3033 }
3034 else
3035 {
3036 Log(("GMMR0AllocateHandyPages: #%#x/%#x: Not found! (shared)\n", iPage, paPages[iPage].idSharedPage));
3037 rc = VERR_GMM_PAGE_NOT_FOUND;
3038 break;
3039 }
3040 }
3041 } /* for each page to update */
3042
3043 if (RT_SUCCESS(rc) && cPagesToAlloc > 0)
3044 {
3045#if defined(VBOX_STRICT) && 0 /** @todo re-test this later. Appeared to be a PGM init bug. */
3046 for (iPage = 0; iPage < cPagesToAlloc; iPage++)
3047 {
3048 Assert(paPages[iPage].HCPhysGCPhys == NIL_RTHCPHYS);
3049 Assert(paPages[iPage].idPage == NIL_GMM_PAGEID);
3050 Assert(paPages[iPage].idSharedPage == NIL_GMM_PAGEID);
3051 }
3052#endif
3053
3054 /*
3055 * Join paths with GMMR0AllocatePages for the allocation.
3056 * Note! gmmR0AllocateChunkNew may leave the protection of the mutex!
3057 */
3058 rc = gmmR0AllocatePagesNew(pGMM, pGVM, cPagesToAlloc, paPages, GMMACCOUNT_BASE);
3059 }
3060 }
3061 else
3062 rc = VERR_WRONG_ORDER;
3063 GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
3064 }
3065 else
3066 rc = VERR_GMM_IS_NOT_SANE;
3067 gmmR0MutexRelease(pGMM);
3068 LogFlow(("GMMR0AllocateHandyPages: returns %Rrc\n", rc));
3069 return rc;
3070}
3071
3072
3073/**
3074 * Allocate one or more pages.
3075 *
3076 * This is typically used for ROMs and MMIO2 (VRAM) during VM creation.
3077 * The allocated pages are not cleared and will contain random garbage.
3078 *
3079 * @returns VBox status code:
3080 * @retval VINF_SUCCESS on success.
3081 * @retval VERR_NOT_OWNER if the caller is not an EMT.
3082 * @retval VERR_GMM_SEED_ME if seeding via GMMR0SeedChunk is necessary.
3083 * @retval VERR_GMM_HIT_GLOBAL_LIMIT if we've exhausted the available pages.
3084 * @retval VERR_GMM_HIT_VM_ACCOUNT_LIMIT if we've hit the VM account limit,
3085 * that is we're trying to allocate more than we've reserved.
3086 *
3087 * @param pGVM The global (ring-0) VM structure.
3088 * @param idCpu The VCPU id.
3089 * @param cPages The number of pages to allocate.
3090 * @param paPages Pointer to the page descriptors.
3091 * See GMMPAGEDESC for details on what is expected on
3092 * input.
3093 * @param enmAccount The account to charge.
3094 *
3095 * @thread EMT.
3096 */
3097GMMR0DECL(int) GMMR0AllocatePages(PGVM pGVM, VMCPUID idCpu, uint32_t cPages, PGMMPAGEDESC paPages, GMMACCOUNT enmAccount)
3098{
3099 LogFlow(("GMMR0AllocatePages: pGVM=%p cPages=%#x paPages=%p enmAccount=%d\n", pGVM, cPages, paPages, enmAccount));
3100
3101 /*
3102 * Validate, get basics and take the semaphore.
3103 */
3104 PGMM pGMM;
3105 GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
3106 int rc = GVMMR0ValidateGVMandEMT(pGVM, idCpu);
3107 if (RT_FAILURE(rc))
3108 return rc;
3109
3110 AssertPtrReturn(paPages, VERR_INVALID_PARAMETER);
3111 AssertMsgReturn(enmAccount > GMMACCOUNT_INVALID && enmAccount < GMMACCOUNT_END, ("%d\n", enmAccount), VERR_INVALID_PARAMETER);
3112 AssertMsgReturn(cPages > 0 && cPages < RT_BIT(32 - PAGE_SHIFT), ("%#x\n", cPages), VERR_INVALID_PARAMETER);
3113
3114 for (unsigned iPage = 0; iPage < cPages; iPage++)
3115 {
3116 AssertMsgReturn( paPages[iPage].HCPhysGCPhys == NIL_RTHCPHYS
3117 || paPages[iPage].HCPhysGCPhys == GMM_GCPHYS_UNSHAREABLE
3118 || ( enmAccount == GMMACCOUNT_BASE
3119 && paPages[iPage].HCPhysGCPhys <= GMM_GCPHYS_LAST
3120 && !(paPages[iPage].HCPhysGCPhys & PAGE_OFFSET_MASK)),
3121 ("#%#x: %RHp enmAccount=%d\n", iPage, paPages[iPage].HCPhysGCPhys, enmAccount),
3122 VERR_INVALID_PARAMETER);
3123 AssertMsgReturn(paPages[iPage].idPage == NIL_GMM_PAGEID, ("#%#x: %#x\n", iPage, paPages[iPage].idPage), VERR_INVALID_PARAMETER);
3124 AssertMsgReturn(paPages[iPage].idSharedPage == NIL_GMM_PAGEID, ("#%#x: %#x\n", iPage, paPages[iPage].idSharedPage), VERR_INVALID_PARAMETER);
3125 }
3126
3127 gmmR0MutexAcquire(pGMM);
3128 if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
3129 {
3130
3131 /* No allocations before the initial reservation has been made! */
3132 if (RT_LIKELY( pGVM->gmm.s.Stats.Reserved.cBasePages
3133 && pGVM->gmm.s.Stats.Reserved.cFixedPages
3134 && pGVM->gmm.s.Stats.Reserved.cShadowPages))
3135 rc = gmmR0AllocatePagesNew(pGMM, pGVM, cPages, paPages, enmAccount);
3136 else
3137 rc = VERR_WRONG_ORDER;
3138 GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
3139 }
3140 else
3141 rc = VERR_GMM_IS_NOT_SANE;
3142 gmmR0MutexRelease(pGMM);
3143 LogFlow(("GMMR0AllocatePages: returns %Rrc\n", rc));
3144 return rc;
3145}
3146
3147
3148/**
3149 * VMMR0 request wrapper for GMMR0AllocatePages.
3150 *
3151 * @returns see GMMR0AllocatePages.
3152 * @param pGVM The global (ring-0) VM structure.
3153 * @param idCpu The VCPU id.
3154 * @param pReq Pointer to the request packet.
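 *
 * A minimal sketch of how a ring-3 caller might size and fill the variable
 * length request; RTMemTmpAllocZ and the VMMR3CallR0 / VMMR0_DO_GMM_ALLOCATE_PAGES
 * dispatch are assumptions here, the field names come from the validation below:
 * @code
 *      uint32_t cbReq = RT_UOFFSETOF_DYN(GMMALLOCATEPAGESREQ, aPages[cPages]);
 *      PGMMALLOCATEPAGESREQ pReq = (PGMMALLOCATEPAGESREQ)RTMemTmpAllocZ(cbReq);
 *      pReq->Hdr.u32Magic = SUPVMMR0REQHDR_MAGIC;
 *      pReq->Hdr.cbReq    = cbReq;
 *      pReq->enmAccount   = GMMACCOUNT_BASE;
 *      pReq->cPages       = cPages;
 *      for (uint32_t i = 0; i < cPages; i++)
 *      {
 *          pReq->aPages[i].HCPhysGCPhys = NIL_RTHCPHYS;
 *          pReq->aPages[i].idPage       = NIL_GMM_PAGEID;
 *          pReq->aPages[i].idSharedPage = NIL_GMM_PAGEID;
 *      }
 *      int rc = VMMR3CallR0(pVM, VMMR0_DO_GMM_ALLOCATE_PAGES, 0, &pReq->Hdr);
 * @endcode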
3155 */
3156GMMR0DECL(int) GMMR0AllocatePagesReq(PGVM pGVM, VMCPUID idCpu, PGMMALLOCATEPAGESREQ pReq)
3157{
3158 /*
3159 * Validate input and pass it on.
3160 */
3161 AssertPtrReturn(pReq, VERR_INVALID_POINTER);
3162 AssertMsgReturn(pReq->Hdr.cbReq >= RT_UOFFSETOF(GMMALLOCATEPAGESREQ, aPages[0]),
3163 ("%#x < %#x\n", pReq->Hdr.cbReq, RT_UOFFSETOF(GMMALLOCATEPAGESREQ, aPages[0])),
3164 VERR_INVALID_PARAMETER);
3165 AssertMsgReturn(pReq->Hdr.cbReq == RT_UOFFSETOF_DYN(GMMALLOCATEPAGESREQ, aPages[pReq->cPages]),
3166 ("%#x != %#x\n", pReq->Hdr.cbReq, RT_UOFFSETOF_DYN(GMMALLOCATEPAGESREQ, aPages[pReq->cPages])),
3167 VERR_INVALID_PARAMETER);
3168
3169 return GMMR0AllocatePages(pGVM, idCpu, pReq->cPages, &pReq->aPages[0], pReq->enmAccount);
3170}
3171
3172
3173/**
3174 * Allocate a large page to represent guest RAM.
3175 *
3176 * The allocated pages are not cleared and will contain random garbage.
3177 *
3178 * @returns VBox status code:
3179 * @retval VINF_SUCCESS on success.
3180 * @retval VERR_NOT_OWNER if the caller is not an EMT.
3181 * @retval VERR_GMM_HIT_GLOBAL_LIMIT if we've exhausted the available pages.
3182 * @retval VERR_GMM_HIT_VM_ACCOUNT_LIMIT if we've hit the VM account limit,
3183 * that is we're trying to allocate more than we've reserved.
3184 * @retval VERR_TRY_AGAIN if the host is temporarily out of large pages.
3185 * @returns see GMMR0AllocatePages.
3186 *
3187 * @param pGVM The global (ring-0) VM structure.
3188 * @param idCpu The VCPU id.
3189 * @param cbPage Large page size.
3190 * @param pIdPage Where to return the GMM page ID of the page.
3191 * @param pHCPhys Where to return the host physical address of the page.
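 *
 * A minimal ring-0 usage sketch (variable names are illustrative only; note
 * that cbPage must be GMM_CHUNK_SIZE, i.e. 2 MB, see the assertion below):
 * @code
 *      uint32_t idPage = NIL_GMM_PAGEID;
 *      RTHCPHYS HCPhys = NIL_RTHCPHYS;
 *      int rc = GMMR0AllocateLargePage(pGVM, idCpu, GMM_CHUNK_SIZE, &idPage, &HCPhys);
 *      if (rc == VERR_TRY_AGAIN)
 *      {
 *          // the host is temporarily out of large pages (see the retval list above)
 *      }
 * @endcode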
3192 */
3193GMMR0DECL(int) GMMR0AllocateLargePage(PGVM pGVM, VMCPUID idCpu, uint32_t cbPage, uint32_t *pIdPage, RTHCPHYS *pHCPhys)
3194{
3195 LogFlow(("GMMR0AllocateLargePage: pGVM=%p cbPage=%x\n", pGVM, cbPage));
3196
3197 AssertReturn(cbPage == GMM_CHUNK_SIZE, VERR_INVALID_PARAMETER);
3198 AssertPtrReturn(pIdPage, VERR_INVALID_PARAMETER);
3199 AssertPtrReturn(pHCPhys, VERR_INVALID_PARAMETER);
3200
3201 /*
3202 * Validate, get basics and take the semaphore.
3203 */
3204 PGMM pGMM;
3205 GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
3206 int rc = GVMMR0ValidateGVMandEMT(pGVM, idCpu);
3207 if (RT_FAILURE(rc))
3208 return rc;
3209
3210#ifdef GMM_WITH_LEGACY_MODE
3211 // /* Not supported in legacy mode where we allocate the memory in ring 3 and lock it in ring 0. */
3212 // if (pGMM->fLegacyAllocationMode)
3213 // return VERR_NOT_SUPPORTED;
3214#endif
3215
3216 *pHCPhys = NIL_RTHCPHYS;
3217 *pIdPage = NIL_GMM_PAGEID;
3218
3219 gmmR0MutexAcquire(pGMM);
3220 if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
3221 {
3222 const unsigned cPages = (GMM_CHUNK_SIZE >> PAGE_SHIFT);
3223 if (RT_UNLIKELY( pGVM->gmm.s.Stats.Allocated.cBasePages + pGVM->gmm.s.Stats.cBalloonedPages + cPages
3224 > pGVM->gmm.s.Stats.Reserved.cBasePages))
3225 {
3226 Log(("GMMR0AllocateLargePage: Reserved=%#llx Allocated+Requested=%#llx+%#x!\n",
3227 pGVM->gmm.s.Stats.Reserved.cBasePages, pGVM->gmm.s.Stats.Allocated.cBasePages, cPages));
3228 gmmR0MutexRelease(pGMM);
3229 return VERR_GMM_HIT_VM_ACCOUNT_LIMIT;
3230 }
3231
3232 /*
3233 * Allocate a new large page chunk.
3234 *
3235 * Note! We leave the giant GMM lock temporarily as the allocation might
3236 * take a long time. gmmR0RegisterChunk will retake it (ugly).
3237 */
3238 AssertCompile(GMM_CHUNK_SIZE == _2M);
3239 gmmR0MutexRelease(pGMM);
3240
3241 RTR0MEMOBJ hMemObj;
3242 rc = RTR0MemObjAllocLarge(&hMemObj, GMM_CHUNK_SIZE, GMM_CHUNK_SIZE, RTMEMOBJ_ALLOC_LARGE_F_FAST);
3243 if (RT_SUCCESS(rc))
3244 {
3245 PGMMCHUNKFREESET pSet = pGMM->fBoundMemoryMode ? &pGVM->gmm.s.Private : &pGMM->PrivateX;
3246 PGMMCHUNK pChunk;
3247 rc = gmmR0RegisterChunk(pGMM, pSet, hMemObj, pGVM->hSelf, GMM_CHUNK_FLAGS_LARGE_PAGE, &pChunk);
3248 if (RT_SUCCESS(rc))
3249 {
3250 /*
3251 * Allocate all the pages in the chunk.
3252 */
3253 /* Unlink the new chunk from the free list. */
3254 gmmR0UnlinkChunk(pChunk);
3255
3256 /** @todo rewrite this to skip the looping. */
3257 /* Allocate all pages. */
3258 GMMPAGEDESC PageDesc;
3259 gmmR0AllocatePage(pChunk, pGVM->hSelf, &PageDesc);
3260
3261 /* Return the first page as we'll use the whole chunk as one big page. */
3262 *pIdPage = PageDesc.idPage;
3263 *pHCPhys = PageDesc.HCPhysGCPhys;
3264
3265 for (unsigned i = 1; i < cPages; i++)
3266 gmmR0AllocatePage(pChunk, pGVM->hSelf, &PageDesc);
3267
3268 /* Update accounting. */
3269 pGVM->gmm.s.Stats.Allocated.cBasePages += cPages;
3270 pGVM->gmm.s.Stats.cPrivatePages += cPages;
3271 pGMM->cAllocatedPages += cPages;
3272
3273 gmmR0LinkChunk(pChunk, pSet);
3274 gmmR0MutexRelease(pGMM);
3275 LogFlow(("GMMR0AllocateLargePage: returns VINF_SUCCESS\n"));
3276 return VINF_SUCCESS;
3277 }
3278 RTR0MemObjFree(hMemObj, true /* fFreeMappings */);
3279 }
3280 }
3281 else
3282 {
3283 gmmR0MutexRelease(pGMM);
3284 rc = VERR_GMM_IS_NOT_SANE;
3285 }
3286
3287 LogFlow(("GMMR0AllocateLargePage: returns %Rrc\n", rc));
3288 return rc;
3289}
3290
3291
3292/**
3293 * Free a large page.
3294 *
3295 * @returns VBox status code:
3296 * @param pGVM The global (ring-0) VM structure.
3297 * @param idCpu The VCPU id.
3298 * @param idPage The large page id.
3299 */
3300GMMR0DECL(int) GMMR0FreeLargePage(PGVM pGVM, VMCPUID idCpu, uint32_t idPage)
3301{
3302 LogFlow(("GMMR0FreeLargePage: pGVM=%p idPage=%x\n", pGVM, idPage));
3303
3304 /*
3305 * Validate, get basics and take the semaphore.
3306 */
3307 PGMM pGMM;
3308 GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
3309 int rc = GVMMR0ValidateGVMandEMT(pGVM, idCpu);
3310 if (RT_FAILURE(rc))
3311 return rc;
3312
3313#ifdef GMM_WITH_LEGACY_MODE
3314 // /* Not supported in legacy mode where we allocate the memory in ring 3 and lock it in ring 0. */
3315 // if (pGMM->fLegacyAllocationMode)
3316 // return VERR_NOT_SUPPORTED;
3317#endif
3318
3319 gmmR0MutexAcquire(pGMM);
3320 if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
3321 {
3322 const unsigned cPages = (GMM_CHUNK_SIZE >> PAGE_SHIFT);
3323
3324 if (RT_UNLIKELY(pGVM->gmm.s.Stats.Allocated.cBasePages < cPages))
3325 {
3326 Log(("GMMR0FreeLargePage: allocated=%#llx cPages=%#x!\n", pGVM->gmm.s.Stats.Allocated.cBasePages, cPages));
3327 gmmR0MutexRelease(pGMM);
3328 return VERR_GMM_ATTEMPT_TO_FREE_TOO_MUCH;
3329 }
3330
3331 PGMMPAGE pPage = gmmR0GetPage(pGMM, idPage);
3332 if (RT_LIKELY( pPage
3333 && GMM_PAGE_IS_PRIVATE(pPage)))
3334 {
3335 PGMMCHUNK pChunk = gmmR0GetChunk(pGMM, idPage >> GMM_CHUNKID_SHIFT);
3336 Assert(pChunk);
3337 Assert(pChunk->cFree < GMM_CHUNK_NUM_PAGES);
3338 Assert(pChunk->cPrivate > 0);
3339
3340 /* Release the memory immediately. */
3341 gmmR0FreeChunk(pGMM, NULL, pChunk, false /*fRelaxedSem*/); /** @todo this can be relaxed too! */
3342
3343 /* Update accounting. */
3344 pGVM->gmm.s.Stats.Allocated.cBasePages -= cPages;
3345 pGVM->gmm.s.Stats.cPrivatePages -= cPages;
3346 pGMM->cAllocatedPages -= cPages;
3347 }
3348 else
3349 rc = VERR_GMM_PAGE_NOT_FOUND;
3350 }
3351 else
3352 rc = VERR_GMM_IS_NOT_SANE;
3353
3354 gmmR0MutexRelease(pGMM);
3355 LogFlow(("GMMR0FreeLargePage: returns %Rrc\n", rc));
3356 return rc;
3357}
3358
3359
3360/**
3361 * VMMR0 request wrapper for GMMR0FreeLargePage.
3362 *
3363 * @returns see GMMR0FreeLargePage.
3364 * @param pGVM The global (ring-0) VM structure.
3365 * @param idCpu The VCPU id.
3366 * @param pReq Pointer to the request packet.
3367 */
3368GMMR0DECL(int) GMMR0FreeLargePageReq(PGVM pGVM, VMCPUID idCpu, PGMMFREELARGEPAGEREQ pReq)
3369{
3370 /*
3371 * Validate input and pass it on.
3372 */
3373 AssertPtrReturn(pReq, VERR_INVALID_POINTER);
3374 AssertMsgReturn(pReq->Hdr.cbReq == sizeof(GMMFREELARGEPAGEREQ),
3375 ("%#x != %#x\n", pReq->Hdr.cbReq, sizeof(GMMFREELARGEPAGEREQ)),
3376 VERR_INVALID_PARAMETER);
3377
3378 return GMMR0FreeLargePage(pGVM, idCpu, pReq->idPage);
3379}
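/* Illustrative sketch (not part of the original source): how a ring-3 caller
 * could drive GMMR0FreeLargePageReq through the VMM request path. The header
 * magic constant and the VMMR0_DO_GMM_FREE_LARGE_PAGE operation name are
 * assumptions following the usual SUPVMMR0REQHDR / VMMR3CallR0 conventions.
 * @code
 *      GMMFREELARGEPAGEREQ Req;
 *      Req.Hdr.u32Magic = SUPVMMR0REQHDR_MAGIC;    // assumed request header magic
 *      Req.Hdr.cbReq    = sizeof(Req);             // must match the size check above
 *      Req.idPage       = idLargePage;             // ID returned by GMMR0AllocateLargePage
 *      int rc = VMMR3CallR0(pVM, VMMR0_DO_GMM_FREE_LARGE_PAGE, 0, &Req.Hdr);
 * @endcode
 */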
3380
3381
3382/**
3383 * @callback_method_impl{FNGVMMR0ENUMCALLBACK,
3384 * Used by gmmR0FreeChunkFlushPerVmTlbs().}
3385 */
3386static DECLCALLBACK(int) gmmR0InvalidatePerVmChunkTlbCallback(PGVM pGVM, void *pvUser)
3387{
3388 RT_NOREF(pvUser);
3389 if (pGVM->gmm.s.hChunkTlbSpinLock != NIL_RTSPINLOCK)
3390 {
3391 RTSpinlockAcquire(pGVM->gmm.s.hChunkTlbSpinLock);
3392 uintptr_t i = RT_ELEMENTS(pGVM->gmm.s.aChunkTlbEntries);
3393 while (i-- > 0)
3394 {
3395 pGVM->gmm.s.aChunkTlbEntries[i].idGeneration = UINT64_MAX;
3396 pGVM->gmm.s.aChunkTlbEntries[i].pChunk = NULL;
3397 }
3398 RTSpinlockRelease(pGVM->gmm.s.hChunkTlbSpinLock);
3399 }
3400 return VINF_SUCCESS;
3401}
3402
3403
3404/**
3405 * Called by gmmR0FreeChunk when we reach the threshold for wrapping around the
3406 * free generation ID value.
3407 *
3408 * This is done at 2^62 - 1 (UINT64_MAX / 4), i.e. long before a real
3409 * wrap-around, which lets us drop all locks while flushing: it takes
3410 * roughly 4.6 exa (4 611 686 018 427 387 903) further gmmR0FreeChunk calls
3411 * to reach the threshold again. We do two invalidation passes and reset
3412 * the generation ID between them, making sure there are no false positives.
3413 *
3414 * @param pGMM Pointer to the GMM instance.
3415 */
3416static void gmmR0FreeChunkFlushPerVmTlbs(PGMM pGMM)
3417{
3418 /*
3419 * First invalidation pass.
3420 */
3421 int rc = GVMMR0EnumVMs(gmmR0InvalidatePerVmChunkTlbCallback, NULL);
3422 AssertRCSuccess(rc);
3423
3424 /*
3425 * Reset the generation number.
3426 */
3427 RTSpinlockAcquire(pGMM->hSpinLockTree);
3428 ASMAtomicWriteU64(&pGMM->idFreeGeneration, 1);
3429 RTSpinlockRelease(pGMM->hSpinLockTree);
3430
3431 /*
3432 * Second invalidation pass.
3433 */
3434 rc = GVMMR0EnumVMs(gmmR0InvalidatePerVmChunkTlbCallback, NULL);
3435 AssertRCSuccess(rc);
3436}
3437
3438
3439/**
3440 * Frees a chunk, giving it back to the host OS.
3441 *
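 * @returns true if the giant GMM lock was temporarily dropped while freeing
 *          the chunk, false otherwise (also when the chunk could not be freed
 *          because it is still mapped).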
3442 * @param pGMM Pointer to the GMM instance.
3443 * @param pGVM This is set when called from GMMR0CleanupVM so we can
3444 * unmap and free the chunk in one go; NULL otherwise.
3445 * @param pChunk The chunk to free.
3446 * @param fRelaxedSem Whether we can release the semaphore while doing the
3447 * freeing (@c true) or not.
3448 */
3449static bool gmmR0FreeChunk(PGMM pGMM, PGVM pGVM, PGMMCHUNK pChunk, bool fRelaxedSem)
3450{
3451 Assert(pChunk->Core.Key != NIL_GMM_CHUNKID);
3452
3453 GMMR0CHUNKMTXSTATE MtxState;
3454 gmmR0ChunkMutexAcquire(&MtxState, pGMM, pChunk, GMMR0CHUNK_MTX_KEEP_GIANT);
3455
3456 /*
3457 * Cleanup hack! Unmap the chunk from the caller's address space.
3458 * This shouldn't happen, so screw lock contention...
3459 */
3460 if ( pChunk->cMappingsX
3461#ifdef GMM_WITH_LEGACY_MODE
3462 && (!pGMM->fLegacyAllocationMode || (pChunk->fFlags & GMM_CHUNK_FLAGS_LARGE_PAGE))
3463#endif
3464 && pGVM)
3465 gmmR0UnmapChunkLocked(pGMM, pGVM, pChunk);
3466
3467 /*
3468 * If there are current mappings of the chunk, then request the
3469 * VMs to unmap them. Reposition the chunk in the free list so
3470 * it won't be a likely candidate for allocations.
3471 */
3472 if (pChunk->cMappingsX)
3473 {
3474 /** @todo R0 -> VM request */
3475 /* The chunk can be mapped by more than one VM if fBoundMemoryMode is false! */
3476 Log(("gmmR0FreeChunk: chunk still has %d mappings; don't free!\n", pChunk->cMappingsX));
3477 gmmR0ChunkMutexRelease(&MtxState, pChunk);
3478 return false;
3479 }
3480
3481
3482 /*
3483 * Save and trash the handle.
3484 */
3485 RTR0MEMOBJ const hMemObj = pChunk->hMemObj;
3486 pChunk->hMemObj = NIL_RTR0MEMOBJ;
3487
3488 /*
3489 * Unlink it from everywhere.
3490 */
3491 gmmR0UnlinkChunk(pChunk);
3492
3493 RTSpinlockAcquire(pGMM->hSpinLockTree);
3494
3495 RTListNodeRemove(&pChunk->ListNode);
3496
3497 PAVLU32NODECORE pCore = RTAvlU32Remove(&pGMM->pChunks, pChunk->Core.Key);
3498 Assert(pCore == &pChunk->Core); NOREF(pCore);
3499
3500 PGMMCHUNKTLBE pTlbe = &pGMM->ChunkTLB.aEntries[GMM_CHUNKTLB_IDX(pChunk->Core.Key)];
3501 if (pTlbe->pChunk == pChunk)
3502 {
3503 pTlbe->idChunk = NIL_GMM_CHUNKID;
3504 pTlbe->pChunk = NULL;
3505 }
3506
3507 Assert(pGMM->cChunks > 0);
3508 pGMM->cChunks--;
3509
3510 uint64_t const idFreeGeneration = ASMAtomicIncU64(&pGMM->idFreeGeneration);
3511
3512 RTSpinlockRelease(pGMM->hSpinLockTree);
3513
3514 /*
3515 * Free the Chunk ID before dropping the locks and freeing the rest.
3516 */
3517 gmmR0FreeChunkId(pGMM, pChunk->Core.Key);
3518 pChunk->Core.Key = NIL_GMM_CHUNKID;
3519
3520 pGMM->cFreedChunks++;
3521
3522 gmmR0ChunkMutexRelease(&MtxState, NULL);
3523 if (fRelaxedSem)
3524 gmmR0MutexRelease(pGMM);
3525
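    /* UINT64_MAX / 4 == 2^62 - 1: flush and reset the per-VM chunk TLB
       generation long before the counter could ever wrap for real. */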
3526 if (idFreeGeneration == UINT64_MAX / 4)
3527 gmmR0FreeChunkFlushPerVmTlbs(pGMM);
3528
3529 RTMemFree(pChunk->paMappingsX);
3530 pChunk->paMappingsX = NULL;
3531
3532 RTMemFree(pChunk);
3533
3534#ifndef VBOX_WITH_LINEAR_HOST_PHYS_MEM
3535 int rc = RTR0MemObjFree(hMemObj, true /* fFreeMappings */);
3536#else
3537 int rc = RTR0MemObjFree(hMemObj, false /* fFreeMappings */);
3538#endif
3539 AssertLogRelRC(rc);
3540
3541 if (fRelaxedSem)
3542 gmmR0MutexAcquire(pGMM);
3543 return fRelaxedSem;
3544}
3545
3546
3547/**
3548 * Free page worker.
3549 *
3550 * The caller does all the statistic decrementing, we do all the incrementing.
3551 *
3552 * @param pGMM Pointer to the GMM instance data.
3553 * @param pGVM Pointer to the GVM instance.
3554 * @param pChunk Pointer to the chunk this page belongs to.
3555 * @param idPage The Page ID.
3556 * @param pPage Pointer to the page.
3557 */
3558static void gmmR0FreePageWorker(PGMM pGMM, PGVM pGVM, PGMMCHUNK pChunk, uint32_t idPage, PGMMPAGE pPage)
3559{
3560 Log3(("F pPage=%p iPage=%#x/%#x u2State=%d iFreeHead=%#x\n",
3561 pPage, pPage - &pChunk->aPages[0], idPage, pPage->Common.u2State, pChunk->iFreeHead)); NOREF(idPage);
3562
3563 /*
3564 * Put the page on the free list.
3565 */
3566 pPage->u = 0;
3567 pPage->Free.u2State = GMM_PAGE_STATE_FREE;
3568 Assert(pChunk->iFreeHead < RT_ELEMENTS(pChunk->aPages) || pChunk->iFreeHead == UINT16_MAX);
3569 pPage->Free.iNext = pChunk->iFreeHead;
3570 pChunk->iFreeHead = pPage - &pChunk->aPages[0];
3571
3572 /*
3573 * Update statistics (the cShared/cPrivate stats are up to date already),
3574 * and relink the chunk if necessary.
3575 */
3576 unsigned const cFree = pChunk->cFree;
3577 if ( !cFree
3578 || gmmR0SelectFreeSetList(cFree) != gmmR0SelectFreeSetList(cFree + 1))
3579 {
3580 gmmR0UnlinkChunk(pChunk);
3581 pChunk->cFree++;
3582 gmmR0SelectSetAndLinkChunk(pGMM, pGVM, pChunk);
3583 }
3584 else
3585 {
3586 pChunk->cFree = cFree + 1;
3587 pChunk->pSet->cFreePages++;
3588 }
3589
3590 /*
3591 * If the chunk becomes empty, consider giving memory back to the host OS.
3592 *
3593 * The current strategy is to try to give it back if there are other chunks
3594 * in this free list, meaning if there are at least 240 free pages in this
3595 * category. Note that since there are probably mappings of the chunk,
3596 * it won't be freed up instantly, which probably screws up this logic
3597 * a bit...
3598 */
3599 /** @todo Do this on the way out. */
3600 if (RT_LIKELY( pChunk->cFree != GMM_CHUNK_NUM_PAGES
3601 || pChunk->pFreeNext == NULL
3602 || pChunk->pFreePrev == NULL /** @todo this is probably misfiring, see reset... */))
3603 { /* likely */ }
3604#ifdef GMM_WITH_LEGACY_MODE
3605 else if (RT_LIKELY(pGMM->fLegacyAllocationMode && !(pChunk->fFlags & GMM_CHUNK_FLAGS_LARGE_PAGE)))
3606 { /* likely */ }
3607#endif
3608 else
3609 gmmR0FreeChunk(pGMM, NULL, pChunk, false);
3610
3611}
3612
3613
3614/**
3615 * Frees a shared page, the page is known to exist and be valid and such.
3616 *
3617 * @param pGMM Pointer to the GMM instance.
3618 * @param pGVM Pointer to the GVM instance.
3619 * @param idPage The page id.
3620 * @param pPage The page structure.
3621 */
3622DECLINLINE(void) gmmR0FreeSharedPage(PGMM pGMM, PGVM pGVM, uint32_t idPage, PGMMPAGE pPage)
3623{
3624 PGMMCHUNK pChunk = gmmR0GetChunk(pGMM, idPage >> GMM_CHUNKID_SHIFT);
3625 Assert(pChunk);
3626 Assert(pChunk->cFree < GMM_CHUNK_NUM_PAGES);
3627 Assert(pChunk->cShared > 0);
3628 Assert(pGMM->cSharedPages > 0);
3629 Assert(pGMM->cAllocatedPages > 0);
3630 Assert(!pPage->Shared.cRefs);
3631
3632 pChunk->cShared--;
3633 pGMM->cAllocatedPages--;
3634 pGMM->cSharedPages--;
3635 gmmR0FreePageWorker(pGMM, pGVM, pChunk, idPage, pPage);
3636}
3637
3638
3639/**
3640 * Frees a private page, the page is known to exist and be valid and such.
3641 *
3642 * @param pGMM Pointer to the GMM instance.
3643 * @param pGVM Pointer to the GVM instance.
3644 * @param idPage The page id.
3645 * @param pPage The page structure.
3646 */
3647DECLINLINE(void) gmmR0FreePrivatePage(PGMM pGMM, PGVM pGVM, uint32_t idPage, PGMMPAGE pPage)
3648{
3649 PGMMCHUNK pChunk = gmmR0GetChunk(pGMM, idPage >> GMM_CHUNKID_SHIFT);
3650 Assert(pChunk);
3651 Assert(pChunk->cFree < GMM_CHUNK_NUM_PAGES);
3652 Assert(pChunk->cPrivate > 0);
3653 Assert(pGMM->cAllocatedPages > 0);
3654
3655 pChunk->cPrivate--;
3656 pGMM->cAllocatedPages--;
3657 gmmR0FreePageWorker(pGMM, pGVM, pChunk, idPage, pPage);
3658}
3659
3660
3661/**
3662 * Common worker for GMMR0FreePages and GMMR0BalloonedPages.
3663 *
3664 * @returns VBox status code:
3665 * @retval VINF_SUCCESS, VERR_GMM_ATTEMPT_TO_FREE_TOO_MUCH, VERR_GMM_NOT_PAGE_OWNER, VERR_GMM_PAGE_ALREADY_FREE or VERR_GMM_PAGE_NOT_FOUND.
3666 *
3667 * @param pGMM Pointer to the GMM instance data.
3668 * @param pGVM Pointer to the VM.
3669 * @param cPages The number of pages to free.
3670 * @param paPages Pointer to the page descriptors.
3671 * @param enmAccount The account this relates to.
3672 */
3673static int gmmR0FreePages(PGMM pGMM, PGVM pGVM, uint32_t cPages, PGMMFREEPAGEDESC paPages, GMMACCOUNT enmAccount)
3674{
3675 /*
3676 * Check that the request isn't impossible wrt to the account status.
3677 */
3678 switch (enmAccount)
3679 {
3680 case GMMACCOUNT_BASE:
3681 if (RT_UNLIKELY(pGVM->gmm.s.Stats.Allocated.cBasePages < cPages))
3682 {
3683 Log(("gmmR0FreePages: allocated=%#llx cPages=%#x!\n", pGVM->gmm.s.Stats.Allocated.cBasePages, cPages));
3684 return VERR_GMM_ATTEMPT_TO_FREE_TOO_MUCH;
3685 }
3686 break;
3687 case GMMACCOUNT_SHADOW:
3688 if (RT_UNLIKELY(pGVM->gmm.s.Stats.Allocated.cShadowPages < cPages))
3689 {
3690 Log(("gmmR0FreePages: allocated=%#llx cPages=%#x!\n", pGVM->gmm.s.Stats.Allocated.cShadowPages, cPages));
3691 return VERR_GMM_ATTEMPT_TO_FREE_TOO_MUCH;
3692 }
3693 break;
3694 case GMMACCOUNT_FIXED:
3695 if (RT_UNLIKELY(pGVM->gmm.s.Stats.Allocated.cFixedPages < cPages))
3696 {
3697 Log(("gmmR0FreePages: allocated=%#llx cPages=%#x!\n", pGVM->gmm.s.Stats.Allocated.cFixedPages, cPages));
3698 return VERR_GMM_ATTEMPT_TO_FREE_TOO_MUCH;
3699 }
3700 break;
3701 default:
3702 AssertMsgFailedReturn(("enmAccount=%d\n", enmAccount), VERR_IPE_NOT_REACHED_DEFAULT_CASE);
3703 }
3704
3705 /*
3706 * Walk the descriptors and free the pages.
3707 *
3708 * Statistics (except the account) are being updated as we go along,
3709 * unlike the alloc code. Also, stop on the first error.
3710 */
3711 int rc = VINF_SUCCESS;
3712 uint32_t iPage;
3713 for (iPage = 0; iPage < cPages; iPage++)
3714 {
3715 uint32_t idPage = paPages[iPage].idPage;
3716 PGMMPAGE pPage = gmmR0GetPage(pGMM, idPage);
3717 if (RT_LIKELY(pPage))
3718 {
3719 if (RT_LIKELY(GMM_PAGE_IS_PRIVATE(pPage)))
3720 {
3721 if (RT_LIKELY(pPage->Private.hGVM == pGVM->hSelf))
3722 {
3723 Assert(pGVM->gmm.s.Stats.cPrivatePages);
3724 pGVM->gmm.s.Stats.cPrivatePages--;
3725 gmmR0FreePrivatePage(pGMM, pGVM, idPage, pPage);
3726 }
3727 else
3728 {
3729 Log(("gmmR0FreePages: #%#x/%#x: not owner! hGVM=%#x hSelf=%#x\n", iPage, idPage,
3730 pPage->Private.hGVM, pGVM->hSelf));
3731 rc = VERR_GMM_NOT_PAGE_OWNER;
3732 break;
3733 }
3734 }
3735 else if (RT_LIKELY(GMM_PAGE_IS_SHARED(pPage)))
3736 {
3737 Assert(pGVM->gmm.s.Stats.cSharedPages);
3738 Assert(pPage->Shared.cRefs);
3739#if defined(VBOX_WITH_PAGE_SHARING) && defined(VBOX_STRICT) && HC_ARCH_BITS == 64
3740 if (pPage->Shared.u14Checksum)
3741 {
3742 uint32_t uChecksum = gmmR0StrictPageChecksum(pGMM, pGVM, idPage);
3743 uChecksum &= UINT32_C(0x00003fff);
3744 AssertMsg(!uChecksum || uChecksum == pPage->Shared.u14Checksum,
3745 ("%#x vs %#x - idPage=%#x\n", uChecksum, pPage->Shared.u14Checksum, idPage));
3746 }
3747#endif
3748 pGVM->gmm.s.Stats.cSharedPages--;
3749 if (!--pPage->Shared.cRefs)
3750 gmmR0FreeSharedPage(pGMM, pGVM, idPage, pPage);
3751 else
3752 {
3753 Assert(pGMM->cDuplicatePages);
3754 pGMM->cDuplicatePages--;
3755 }
3756 }
3757 else
3758 {
3759 Log(("gmmR0FreePages: #%#x/%#x: already free!\n", iPage, idPage));
3760 rc = VERR_GMM_PAGE_ALREADY_FREE;
3761 break;
3762 }
3763 }
3764 else
3765 {
3766 Log(("gmmR0FreePages: #%#x/%#x: not found!\n", iPage, idPage));
3767 rc = VERR_GMM_PAGE_NOT_FOUND;
3768 break;
3769 }
3770 paPages[iPage].idPage = NIL_GMM_PAGEID;
3771 }
3772
3773 /*
3774 * Update the account.
3775 */
3776 switch (enmAccount)
3777 {
3778 case GMMACCOUNT_BASE: pGVM->gmm.s.Stats.Allocated.cBasePages -= iPage; break;
3779 case GMMACCOUNT_SHADOW: pGVM->gmm.s.Stats.Allocated.cShadowPages -= iPage; break;
3780 case GMMACCOUNT_FIXED: pGVM->gmm.s.Stats.Allocated.cFixedPages -= iPage; break;
3781 default:
3782 AssertMsgFailedReturn(("enmAccount=%d\n", enmAccount), VERR_IPE_NOT_REACHED_DEFAULT_CASE);
3783 }
3784
3785 /*
3786 * Any threshold stuff to be done here?
3787 */
3788
3789 return rc;
3790}
3791
3792
3793/**
3794 * Free one or more pages.
3795 *
3796 * This is typically used at reset time or power off.
3797 *
3798 * @returns VBox status code:
3799 * @retval VERR_INVALID_PARAMETER, VERR_GMM_IS_NOT_SANE or any of the gmmR0FreePages statuses.
3800 *
3801 * @param pGVM The global (ring-0) VM structure.
3802 * @param idCpu The VCPU id.
3803 * @param cPages The number of pages to free.
3804 * @param paPages Pointer to the page descriptors containing the page IDs
3805 * for each page.
3806 * @param enmAccount The account this relates to.
3807 * @thread EMT.
3808 */
3809GMMR0DECL(int) GMMR0FreePages(PGVM pGVM, VMCPUID idCpu, uint32_t cPages, PGMMFREEPAGEDESC paPages, GMMACCOUNT enmAccount)
3810{
3811 LogFlow(("GMMR0FreePages: pGVM=%p cPages=%#x paPages=%p enmAccount=%d\n", pGVM, cPages, paPages, enmAccount));
3812
3813 /*
3814 * Validate input and get the basics.
3815 */
3816 PGMM pGMM;
3817 GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
3818 int rc = GVMMR0ValidateGVMandEMT(pGVM, idCpu);
3819 if (RT_FAILURE(rc))
3820 return rc;
3821
3822 AssertPtrReturn(paPages, VERR_INVALID_PARAMETER);
3823 AssertMsgReturn(enmAccount > GMMACCOUNT_INVALID && enmAccount < GMMACCOUNT_END, ("%d\n", enmAccount), VERR_INVALID_PARAMETER);
3824 AssertMsgReturn(cPages > 0 && cPages < RT_BIT(32 - PAGE_SHIFT), ("%#x\n", cPages), VERR_INVALID_PARAMETER);
3825
3826 for (unsigned iPage = 0; iPage < cPages; iPage++)
3827 AssertMsgReturn( paPages[iPage].idPage <= GMM_PAGEID_LAST
3828 /*|| paPages[iPage].idPage == NIL_GMM_PAGEID*/,
3829 ("#%#x: %#x\n", iPage, paPages[iPage].idPage), VERR_INVALID_PARAMETER);
3830
3831 /*
3832 * Take the semaphore and call the worker function.
3833 */
3834 gmmR0MutexAcquire(pGMM);
3835 if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
3836 {
3837 rc = gmmR0FreePages(pGMM, pGVM, cPages, paPages, enmAccount);
3838 GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
3839 }
3840 else
3841 rc = VERR_GMM_IS_NOT_SANE;
3842 gmmR0MutexRelease(pGMM);
3843 LogFlow(("GMMR0FreePages: returns %Rrc\n", rc));
3844 return rc;
3845}
3846
3847
3848/**
3849 * VMMR0 request wrapper for GMMR0FreePages.
3850 *
3851 * @returns see GMMR0FreePages.
3852 * @param pGVM The global (ring-0) VM structure.
3853 * @param idCpu The VCPU id.
3854 * @param pReq Pointer to the request packet.
3855 */
3856GMMR0DECL(int) GMMR0FreePagesReq(PGVM pGVM, VMCPUID idCpu, PGMMFREEPAGESREQ pReq)
3857{
3858 /*
3859 * Validate input and pass it on.
3860 */
3861 AssertPtrReturn(pReq, VERR_INVALID_POINTER);
3862 AssertMsgReturn(pReq->Hdr.cbReq >= RT_UOFFSETOF(GMMFREEPAGESREQ, aPages[0]),
3863 ("%#x < %#x\n", pReq->Hdr.cbReq, RT_UOFFSETOF(GMMFREEPAGESREQ, aPages[0])),
3864 VERR_INVALID_PARAMETER);
3865 AssertMsgReturn(pReq->Hdr.cbReq == RT_UOFFSETOF_DYN(GMMFREEPAGESREQ, aPages[pReq->cPages]),
3866 ("%#x != %#x\n", pReq->Hdr.cbReq, RT_UOFFSETOF_DYN(GMMFREEPAGESREQ, aPages[pReq->cPages])),
3867 VERR_INVALID_PARAMETER);
3868
3869 return GMMR0FreePages(pGVM, idCpu, pReq->cPages, &pReq->aPages[0], pReq->enmAccount);
3870}
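/* Illustrative sketch (not part of the original source): building the variable
 * sized GMMFREEPAGESREQ that the size checks above expect. Only the fields and
 * macros used by those checks are taken from this file; the header magic and
 * the VMMR0_DO_GMM_FREE_PAGES operation name are assumptions.
 * @code
 *      uint32_t const   cbReq = RT_UOFFSETOF_DYN(GMMFREEPAGESREQ, aPages[cPages]);
 *      PGMMFREEPAGESREQ pReq  = (PGMMFREEPAGESREQ)RTMemAllocZ(cbReq);
 *      pReq->Hdr.u32Magic = SUPVMMR0REQHDR_MAGIC;  // assumed request header magic
 *      pReq->Hdr.cbReq    = cbReq;                 // must equal the dynamic offset above
 *      pReq->enmAccount   = GMMACCOUNT_BASE;
 *      pReq->cPages       = cPages;
 *      for (uint32_t i = 0; i < cPages; i++)
 *          pReq->aPages[i].idPage = paidPages[i];  // page IDs to free
 *      int rc = VMMR3CallR0(pVM, VMMR0_DO_GMM_FREE_PAGES, 0, &pReq->Hdr);
 *      RTMemFree(pReq);
 * @endcode
 */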
3871
3872
3873/**
3874 * Report back on a memory ballooning request.
3875 *
3876 * The request may or may not have been initiated by the GMM. If it was initiated
3877 * by the GMM it is important that this function is called even if no pages were
3878 * ballooned.
3879 *
3880 * @returns VBox status code:
3881 * @retval VERR_GMM_ATTEMPT_TO_FREE_TOO_MUCH
3882 * @retval VERR_GMM_ATTEMPT_TO_DEFLATE_TOO_MUCH
3883 * @retval VERR_GMM_OVERCOMMITTED_TRY_AGAIN_IN_A_BIT - reset condition
3884 * indicating that we won't necessarily have sufficient RAM to boot
3885 * the VM again and that it should pause until this changes (we'll try
3886 * balloon some other VM). (For standard deflate we have little choice
3887 * but to hope the VM won't use the memory that was returned to it.)
3888 *
3889 * @param pGVM The global (ring-0) VM structure.
3890 * @param idCpu The VCPU id.
3891 * @param enmAction Inflate/deflate/reset.
3892 * @param cBalloonedPages The number of pages that was ballooned.
3893 *
3894 * @thread EMT(idCpu)
3895 */
3896GMMR0DECL(int) GMMR0BalloonedPages(PGVM pGVM, VMCPUID idCpu, GMMBALLOONACTION enmAction, uint32_t cBalloonedPages)
3897{
3898 LogFlow(("GMMR0BalloonedPages: pGVM=%p enmAction=%d cBalloonedPages=%#x\n",
3899 pGVM, enmAction, cBalloonedPages));
3900
3901 AssertMsgReturn(cBalloonedPages < RT_BIT(32 - PAGE_SHIFT), ("%#x\n", cBalloonedPages), VERR_INVALID_PARAMETER);
3902
3903 /*
3904 * Validate input and get the basics.
3905 */
3906 PGMM pGMM;
3907 GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
3908 int rc = GVMMR0ValidateGVMandEMT(pGVM, idCpu);
3909 if (RT_FAILURE(rc))
3910 return rc;
3911
3912 /*
3913 * Take the semaphore and do some more validations.
3914 */
3915 gmmR0MutexAcquire(pGMM);
3916 if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
3917 {
3918 switch (enmAction)
3919 {
3920 case GMMBALLOONACTION_INFLATE:
3921 {
3922 if (RT_LIKELY(pGVM->gmm.s.Stats.Allocated.cBasePages + pGVM->gmm.s.Stats.cBalloonedPages + cBalloonedPages
3923 <= pGVM->gmm.s.Stats.Reserved.cBasePages))
3924 {
3925 /*
3926 * Record the ballooned memory.
3927 */
3928 pGMM->cBalloonedPages += cBalloonedPages;
3929 if (pGVM->gmm.s.Stats.cReqBalloonedPages)
3930 {
3931 /* Codepath never taken. Might be interesting in the future to request ballooned memory from guests in low-memory conditions... */
3932 AssertFailed();
3933
3934 pGVM->gmm.s.Stats.cBalloonedPages += cBalloonedPages;
3935 pGVM->gmm.s.Stats.cReqActuallyBalloonedPages += cBalloonedPages;
3936 Log(("GMMR0BalloonedPages: +%#x - Global=%#llx / VM: Total=%#llx Req=%#llx Actual=%#llx (pending)\n",
3937 cBalloonedPages, pGMM->cBalloonedPages, pGVM->gmm.s.Stats.cBalloonedPages,
3938 pGVM->gmm.s.Stats.cReqBalloonedPages, pGVM->gmm.s.Stats.cReqActuallyBalloonedPages));
3939 }
3940 else
3941 {
3942 pGVM->gmm.s.Stats.cBalloonedPages += cBalloonedPages;
3943 Log(("GMMR0BalloonedPages: +%#x - Global=%#llx / VM: Total=%#llx (user)\n",
3944 cBalloonedPages, pGMM->cBalloonedPages, pGVM->gmm.s.Stats.cBalloonedPages));
3945 }
3946 }
3947 else
3948 {
3949 Log(("GMMR0BalloonedPages: cBasePages=%#llx Total=%#llx cBalloonedPages=%#llx Reserved=%#llx\n",
3950 pGVM->gmm.s.Stats.Allocated.cBasePages, pGVM->gmm.s.Stats.cBalloonedPages, cBalloonedPages,
3951 pGVM->gmm.s.Stats.Reserved.cBasePages));
3952 rc = VERR_GMM_ATTEMPT_TO_FREE_TOO_MUCH;
3953 }
3954 break;
3955 }
3956
3957 case GMMBALLOONACTION_DEFLATE:
3958 {
3959 /* Deflate. */
3960 if (pGVM->gmm.s.Stats.cBalloonedPages >= cBalloonedPages)
3961 {
3962 /*
3963 * Record the ballooned memory.
3964 */
3965 Assert(pGMM->cBalloonedPages >= cBalloonedPages);
3966 pGMM->cBalloonedPages -= cBalloonedPages;
3967 pGVM->gmm.s.Stats.cBalloonedPages -= cBalloonedPages;
3968 if (pGVM->gmm.s.Stats.cReqDeflatePages)
3969 {
3970 AssertFailed(); /* This path is for later. */
3971 Log(("GMMR0BalloonedPages: -%#x - Global=%#llx / VM: Total=%#llx Req=%#llx\n",
3972 cBalloonedPages, pGMM->cBalloonedPages, pGVM->gmm.s.Stats.cBalloonedPages, pGVM->gmm.s.Stats.cReqDeflatePages));
3973
3974 /*
3975 * Anything we need to do here now when the request has been completed?
3976 */
3977 pGVM->gmm.s.Stats.cReqDeflatePages = 0;
3978 }
3979 else
3980 Log(("GMMR0BalloonedPages: -%#x - Global=%#llx / VM: Total=%#llx (user)\n",
3981 cBalloonedPages, pGMM->cBalloonedPages, pGVM->gmm.s.Stats.cBalloonedPages));
3982 }
3983 else
3984 {
3985 Log(("GMMR0BalloonedPages: Total=%#llx cBalloonedPages=%#llx\n", pGVM->gmm.s.Stats.cBalloonedPages, cBalloonedPages));
3986 rc = VERR_GMM_ATTEMPT_TO_DEFLATE_TOO_MUCH;
3987 }
3988 break;
3989 }
3990
3991 case GMMBALLOONACTION_RESET:
3992 {
3993 /* Reset to an empty balloon. */
3994 Assert(pGMM->cBalloonedPages >= pGVM->gmm.s.Stats.cBalloonedPages);
3995
3996 pGMM->cBalloonedPages -= pGVM->gmm.s.Stats.cBalloonedPages;
3997 pGVM->gmm.s.Stats.cBalloonedPages = 0;
3998 break;
3999 }
4000
4001 default:
4002 rc = VERR_INVALID_PARAMETER;
4003 break;
4004 }
4005 GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
4006 }
4007 else
4008 rc = VERR_GMM_IS_NOT_SANE;
4009
4010 gmmR0MutexRelease(pGMM);
4011 LogFlow(("GMMR0BalloonedPages: returns %Rrc\n", rc));
4012 return rc;
4013}
4014
4015
4016/**
4017 * VMMR0 request wrapper for GMMR0BalloonedPages.
4018 *
4019 * @returns see GMMR0BalloonedPages.
4020 * @param pGVM The global (ring-0) VM structure.
4021 * @param idCpu The VCPU id.
4022 * @param pReq Pointer to the request packet.
4023 */
4024GMMR0DECL(int) GMMR0BalloonedPagesReq(PGVM pGVM, VMCPUID idCpu, PGMMBALLOONEDPAGESREQ pReq)
4025{
4026 /*
4027 * Validate input and pass it on.
4028 */
4029 AssertPtrReturn(pReq, VERR_INVALID_POINTER);
4030 AssertMsgReturn(pReq->Hdr.cbReq == sizeof(GMMBALLOONEDPAGESREQ),
4031 ("%#x != %#x\n", pReq->Hdr.cbReq, sizeof(GMMBALLOONEDPAGESREQ)),
4032 VERR_INVALID_PARAMETER);
4033
4034 return GMMR0BalloonedPages(pGVM, idCpu, pReq->enmAction, pReq->cBalloonedPages);
4035}
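/* Illustrative sketch (not part of the original source): reporting an inflate
 * of cPages ballooned pages from ring-3. The structure fields match the checks
 * in the wrapper above; the header magic and VMMR0_DO_GMM_BALLOONED_PAGES
 * operation name are assumptions.
 * @code
 *      GMMBALLOONEDPAGESREQ Req;
 *      Req.Hdr.u32Magic    = SUPVMMR0REQHDR_MAGIC; // assumed request header magic
 *      Req.Hdr.cbReq       = sizeof(Req);
 *      Req.enmAction       = GMMBALLOONACTION_INFLATE;
 *      Req.cBalloonedPages = cPages;
 *      int rc = VMMR3CallR0(pVM, VMMR0_DO_GMM_BALLOONED_PAGES, 0, &Req.Hdr);
 * @endcode
 */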
4036
4037
4038/**
4039 * Return memory statistics for the hypervisor
4040 *
4041 * @returns VBox status code.
4042 * @param pReq Pointer to the request packet.
4043 */
4044GMMR0DECL(int) GMMR0QueryHypervisorMemoryStatsReq(PGMMMEMSTATSREQ pReq)
4045{
4046 /*
4047 * Validate input and pass it on.
4048 */
4049 AssertPtrReturn(pReq, VERR_INVALID_POINTER);
4050 AssertMsgReturn(pReq->Hdr.cbReq == sizeof(GMMMEMSTATSREQ),
4051 ("%#x != %#x\n", pReq->Hdr.cbReq, sizeof(GMMMEMSTATSREQ)),
4052 VERR_INVALID_PARAMETER);
4053
4054 /*
4055 * Validate input and get the basics.
4056 */
4057 PGMM pGMM;
4058 GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
4059 pReq->cAllocPages = pGMM->cAllocatedPages;
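    /* Each chunk contributes GMM_CHUNK_SIZE / PAGE_SIZE pages (512 with 2 MiB
       chunks and 4 KiB host pages), so the free count is the total page
       capacity of all chunks minus what is currently allocated. */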
4060 pReq->cFreePages = (pGMM->cChunks << (GMM_CHUNK_SHIFT - PAGE_SHIFT)) - pGMM->cAllocatedPages;
4061 pReq->cBalloonedPages = pGMM->cBalloonedPages;
4062 pReq->cMaxPages = pGMM->cMaxPages;
4063 pReq->cSharedPages = pGMM->cDuplicatePages;
4064 GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
4065
4066 return VINF_SUCCESS;
4067}
4068
4069
4070/**
4071 * Return memory statistics for the VM
4072 *
4073 * @returns VBox status code.
4074 * @param pGVM The global (ring-0) VM structure.
4075 * @param idCpu Cpu id.
4076 * @param pReq Pointer to the request packet.
4077 *
4078 * @thread EMT(idCpu)
4079 */
4080GMMR0DECL(int) GMMR0QueryMemoryStatsReq(PGVM pGVM, VMCPUID idCpu, PGMMMEMSTATSREQ pReq)
4081{
4082 /*
4083 * Validate input and pass it on.
4084 */
4085 AssertPtrReturn(pReq, VERR_INVALID_POINTER);
4086 AssertMsgReturn(pReq->Hdr.cbReq == sizeof(GMMMEMSTATSREQ),
4087 ("%#x != %#x\n", pReq->Hdr.cbReq, sizeof(GMMMEMSTATSREQ)),
4088 VERR_INVALID_PARAMETER);
4089
4090 /*
4091 * Validate input and get the basics.
4092 */
4093 PGMM pGMM;
4094 GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
4095 int rc = GVMMR0ValidateGVMandEMT(pGVM, idCpu);
4096 if (RT_FAILURE(rc))
4097 return rc;
4098
4099 /*
4100 * Take the semaphore and do some more validations.
4101 */
4102 gmmR0MutexAcquire(pGMM);
4103 if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
4104 {
4105 pReq->cAllocPages = pGVM->gmm.s.Stats.Allocated.cBasePages;
4106 pReq->cBalloonedPages = pGVM->gmm.s.Stats.cBalloonedPages;
4107 pReq->cMaxPages = pGVM->gmm.s.Stats.Reserved.cBasePages;
4108 pReq->cFreePages = pReq->cMaxPages - pReq->cAllocPages;
4109 }
4110 else
4111 rc = VERR_GMM_IS_NOT_SANE;
4112
4113 gmmR0MutexRelease(pGMM);
4114 LogFlow(("GMMR0QueryMemoryStatsReq: returns %Rrc\n", rc));
4115 return rc;
4116}
4117
4118
4119/**
4120 * Worker for gmmR0UnmapChunk and gmmr0FreeChunk.
4121 *
4122 * Don't call this in legacy allocation mode!
4123 *
4124 * @returns VBox status code.
4125 * @param pGMM Pointer to the GMM instance data.
4126 * @param pGVM Pointer to the Global VM structure.
4127 * @param pChunk Pointer to the chunk to be unmapped.
4128 */
4129static int gmmR0UnmapChunkLocked(PGMM pGMM, PGVM pGVM, PGMMCHUNK pChunk)
4130{
4131 RT_NOREF_PV(pGMM);
4132#ifdef GMM_WITH_LEGACY_MODE
4133 Assert(!pGMM->fLegacyAllocationMode || (pChunk->fFlags & GMM_CHUNK_FLAGS_LARGE_PAGE));
4134#endif
4135
4136 /*
4137 * Find the mapping and try unmapping it.
4138 */
4139 uint32_t cMappings = pChunk->cMappingsX;
4140 for (uint32_t i = 0; i < cMappings; i++)
4141 {
4142 Assert(pChunk->paMappingsX[i].pGVM && pChunk->paMappingsX[i].hMapObj != NIL_RTR0MEMOBJ);
4143 if (pChunk->paMappingsX[i].pGVM == pGVM)
4144 {
4145 /* unmap */
4146 int rc = RTR0MemObjFree(pChunk->paMappingsX[i].hMapObj, false /* fFreeMappings (NA) */);
4147 if (RT_SUCCESS(rc))
4148 {
4149 /* update the record. */
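                /* (Order does not matter, so close the gap by moving the last
                    entry into the freed slot and shrink the count.) */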
4150 cMappings--;
4151 if (i < cMappings)
4152 pChunk->paMappingsX[i] = pChunk->paMappingsX[cMappings];
4153 pChunk->paMappingsX[cMappings].hMapObj = NIL_RTR0MEMOBJ;
4154 pChunk->paMappingsX[cMappings].pGVM = NULL;
4155 Assert(pChunk->cMappingsX - 1U == cMappings);
4156 pChunk->cMappingsX = cMappings;
4157 }
4158
4159 return rc;
4160 }
4161 }
4162
4163 Log(("gmmR0UnmapChunk: Chunk %#x is not mapped into pGVM=%p/%#x\n", pChunk->Core.Key, pGVM, pGVM->hSelf));
4164 return VERR_GMM_CHUNK_NOT_MAPPED;
4165}
4166
4167
4168/**
4169 * Unmaps a chunk previously mapped into the address space of the current process.
4170 *
4171 * @returns VBox status code.
4172 * @param pGMM Pointer to the GMM instance data.
4173 * @param pGVM Pointer to the Global VM structure.
4174 * @param pChunk Pointer to the chunk to be unmapped.
4175 * @param fRelaxedSem Whether we can release the semaphore while doing the
4176 * mapping (@c true) or not.
4177 */
4178static int gmmR0UnmapChunk(PGMM pGMM, PGVM pGVM, PGMMCHUNK pChunk, bool fRelaxedSem)
4179{
4180#ifdef GMM_WITH_LEGACY_MODE
4181 if (!pGMM->fLegacyAllocationMode || (pChunk->fFlags & GMM_CHUNK_FLAGS_LARGE_PAGE))
4182 {
4183#endif
4184 /*
4185 * Lock the chunk and if possible leave the giant GMM lock.
4186 */
4187 GMMR0CHUNKMTXSTATE MtxState;
4188 int rc = gmmR0ChunkMutexAcquire(&MtxState, pGMM, pChunk,
4189 fRelaxedSem ? GMMR0CHUNK_MTX_RETAKE_GIANT : GMMR0CHUNK_MTX_KEEP_GIANT);
4190 if (RT_SUCCESS(rc))
4191 {
4192 rc = gmmR0UnmapChunkLocked(pGMM, pGVM, pChunk);
4193 gmmR0ChunkMutexRelease(&MtxState, pChunk);
4194 }
4195 return rc;
4196#ifdef GMM_WITH_LEGACY_MODE
4197 }
4198
4199 if (pChunk->hGVM == pGVM->hSelf)
4200 return VINF_SUCCESS;
4201
4202 Log(("gmmR0UnmapChunk: Chunk %#x is not mapped into pGVM=%p/%#x (legacy)\n", pChunk->Core.Key, pGVM, pGVM->hSelf));
4203 return VERR_GMM_CHUNK_NOT_MAPPED;
4204#endif
4205}
4206
4207
4208/**
4209 * Worker for gmmR0MapChunk.
4210 *
4211 * @returns VBox status code.
4212 * @param pGMM Pointer to the GMM instance data.
4213 * @param pGVM Pointer to the Global VM structure.
4214 * @param pChunk Pointer to the chunk to be mapped.
4215 * @param ppvR3 Where to store the ring-3 address of the mapping.
4216 * In the VERR_GMM_CHUNK_ALREADY_MAPPED case, this will
4217 * contain the address of the existing mapping.
4218 */
4219static int gmmR0MapChunkLocked(PGMM pGMM, PGVM pGVM, PGMMCHUNK pChunk, PRTR3PTR ppvR3)
4220{
4221#ifdef GMM_WITH_LEGACY_MODE
4222 /*
4223 * If we're in legacy mode this is simple.
4224 */
4225 if (pGMM->fLegacyAllocationMode && !(pChunk->fFlags & GMM_CHUNK_FLAGS_LARGE_PAGE))
4226 {
4227 if (pChunk->hGVM != pGVM->hSelf)
4228 {
4229 Log(("gmmR0MapChunk: chunk %#x is already mapped at %p!\n", pChunk->Core.Key, *ppvR3));
4230 return VERR_GMM_CHUNK_NOT_FOUND;
4231 }
4232
4233 *ppvR3 = RTR0MemObjAddressR3(pChunk->hMemObj);
4234 return VINF_SUCCESS;
4235 }
4236#else
4237 RT_NOREF(pGMM);
4238#endif
4239
4240 /*
4241 * Check to see if the chunk is already mapped.
4242 */
4243 for (uint32_t i = 0; i < pChunk->cMappingsX; i++)
4244 {
4245 Assert(pChunk->paMappingsX[i].pGVM && pChunk->paMappingsX[i].hMapObj != NIL_RTR0MEMOBJ);
4246 if (pChunk->paMappingsX[i].pGVM == pGVM)
4247 {
4248 *ppvR3 = RTR0MemObjAddressR3(pChunk->paMappingsX[i].hMapObj);
4249 Log(("gmmR0MapChunk: chunk %#x is already mapped at %p!\n", pChunk->Core.Key, *ppvR3));
4250#ifdef VBOX_WITH_PAGE_SHARING
4251 /* The ring-3 chunk cache can be out of sync; don't fail. */
4252 return VINF_SUCCESS;
4253#else
4254 return VERR_GMM_CHUNK_ALREADY_MAPPED;
4255#endif
4256 }
4257 }
4258
4259 /*
4260 * Do the mapping.
4261 */
4262 RTR0MEMOBJ hMapObj;
4263 int rc = RTR0MemObjMapUser(&hMapObj, pChunk->hMemObj, (RTR3PTR)-1, 0, RTMEM_PROT_READ | RTMEM_PROT_WRITE, NIL_RTR0PROCESS);
4264 if (RT_SUCCESS(rc))
4265 {
4266 /* reallocate the array? assumes few users per chunk (usually one). */
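        /* (Growth pattern: capacity goes 1, 2, 3, 4 one entry at a time, then
            grows in steps of four - 8, 12, 16, ... - whenever the current
            count is a multiple of four.) */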
4267 unsigned iMapping = pChunk->cMappingsX;
4268 if ( iMapping <= 3
4269 || (iMapping & 3) == 0)
4270 {
4271 unsigned cNewSize = iMapping <= 3
4272 ? iMapping + 1
4273 : iMapping + 4;
4274 Assert(cNewSize < 4 || RT_ALIGN_32(cNewSize, 4) == cNewSize);
4275 if (RT_UNLIKELY(cNewSize > UINT16_MAX))
4276 {
4277 rc = RTR0MemObjFree(hMapObj, false /* fFreeMappings (NA) */); AssertRC(rc);
4278 return VERR_GMM_TOO_MANY_CHUNK_MAPPINGS;
4279 }
4280
4281 void *pvMappings = RTMemRealloc(pChunk->paMappingsX, cNewSize * sizeof(pChunk->paMappingsX[0]));
4282 if (RT_UNLIKELY(!pvMappings))
4283 {
4284 rc = RTR0MemObjFree(hMapObj, false /* fFreeMappings (NA) */); AssertRC(rc);
4285 return VERR_NO_MEMORY;
4286 }
4287 pChunk->paMappingsX = (PGMMCHUNKMAP)pvMappings;
4288 }
4289
4290 /* insert new entry */
4291 pChunk->paMappingsX[iMapping].hMapObj = hMapObj;
4292 pChunk->paMappingsX[iMapping].pGVM = pGVM;
4293 Assert(pChunk->cMappingsX == iMapping);
4294 pChunk->cMappingsX = iMapping + 1;
4295
4296 *ppvR3 = RTR0MemObjAddressR3(hMapObj);
4297 }
4298
4299 return rc;
4300}
4301
4302
4303/**
4304 * Maps a chunk into the user address space of the current process.
4305 *
4306 * @returns VBox status code.
4307 * @param pGMM Pointer to the GMM instance data.
4308 * @param pGVM Pointer to the Global VM structure.
4309 * @param pChunk Pointer to the chunk to be mapped.
4310 * @param fRelaxedSem Whether we can release the semaphore while doing the
4311 * mapping (@c true) or not.
4312 * @param ppvR3 Where to store the ring-3 address of the mapping.
4313 * In the VERR_GMM_CHUNK_ALREADY_MAPPED case, this will
4314 * contain the address of the existing mapping.
4315 */
4316static int gmmR0MapChunk(PGMM pGMM, PGVM pGVM, PGMMCHUNK pChunk, bool fRelaxedSem, PRTR3PTR ppvR3)
4317{
4318 /*
4319 * Take the chunk lock and leave the giant GMM lock when possible, then
4320 * call the worker function.
4321 */
4322 GMMR0CHUNKMTXSTATE MtxState;
4323 int rc = gmmR0ChunkMutexAcquire(&MtxState, pGMM, pChunk,
4324 fRelaxedSem ? GMMR0CHUNK_MTX_RETAKE_GIANT : GMMR0CHUNK_MTX_KEEP_GIANT);
4325 if (RT_SUCCESS(rc))
4326 {
4327 rc = gmmR0MapChunkLocked(pGMM, pGVM, pChunk, ppvR3);
4328 gmmR0ChunkMutexRelease(&MtxState, pChunk);
4329 }
4330
4331 return rc;
4332}
4333
4334
4335
4336#if defined(VBOX_WITH_PAGE_SHARING) || (defined(VBOX_STRICT) && HC_ARCH_BITS == 64)
4337/**
4338 * Check if a chunk is mapped into the specified VM
4339 *
4340 * @returns mapped yes/no
4341 * @param pGMM Pointer to the GMM instance.
4342 * @param pGVM Pointer to the Global VM structure.
4343 * @param pChunk Pointer to the chunk to check for a mapping.
4344 * @param ppvR3 Where to store the ring-3 address of the mapping.
4345 */
4346static bool gmmR0IsChunkMapped(PGMM pGMM, PGVM pGVM, PGMMCHUNK pChunk, PRTR3PTR ppvR3)
4347{
4348 GMMR0CHUNKMTXSTATE MtxState;
4349 gmmR0ChunkMutexAcquire(&MtxState, pGMM, pChunk, GMMR0CHUNK_MTX_KEEP_GIANT);
4350 for (uint32_t i = 0; i < pChunk->cMappingsX; i++)
4351 {
4352 Assert(pChunk->paMappingsX[i].pGVM && pChunk->paMappingsX[i].hMapObj != NIL_RTR0MEMOBJ);
4353 if (pChunk->paMappingsX[i].pGVM == pGVM)
4354 {
4355 *ppvR3 = RTR0MemObjAddressR3(pChunk->paMappingsX[i].hMapObj);
4356 gmmR0ChunkMutexRelease(&MtxState, pChunk);
4357 return true;
4358 }
4359 }
4360 *ppvR3 = NULL;
4361 gmmR0ChunkMutexRelease(&MtxState, pChunk);
4362 return false;
4363}
4364#endif /* VBOX_WITH_PAGE_SHARING || (VBOX_STRICT && 64-BIT) */
4365
4366
4367/**
4368 * Map a chunk and/or unmap another chunk.
4369 *
4370 * The mapping and unmapping applies to the current process.
4371 *
4372 * This API does two things because it saves a kernel call per mapping
4373 * when the ring-3 mapping cache is full.
4374 *
4375 * @returns VBox status code.
4376 * @param pGVM The global (ring-0) VM structure.
4377 * @param idChunkMap The chunk to map. NIL_GMM_CHUNKID if nothing to map.
4378 * @param idChunkUnmap The chunk to unmap. NIL_GMM_CHUNKID if nothing to unmap.
4379 * @param ppvR3 Where to store the address of the mapped chunk. NULL is ok if nothing to map.
4380 * @thread EMT ???
4381 */
4382GMMR0DECL(int) GMMR0MapUnmapChunk(PGVM pGVM, uint32_t idChunkMap, uint32_t idChunkUnmap, PRTR3PTR ppvR3)
4383{
4384 LogFlow(("GMMR0MapUnmapChunk: pGVM=%p idChunkMap=%#x idChunkUnmap=%#x ppvR3=%p\n",
4385 pGVM, idChunkMap, idChunkUnmap, ppvR3));
4386
4387 /*
4388 * Validate input and get the basics.
4389 */
4390 PGMM pGMM;
4391 GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
4392 int rc = GVMMR0ValidateGVM(pGVM);
4393 if (RT_FAILURE(rc))
4394 return rc;
4395
4396 AssertCompile(NIL_GMM_CHUNKID == 0);
4397 AssertMsgReturn(idChunkMap <= GMM_CHUNKID_LAST, ("%#x\n", idChunkMap), VERR_INVALID_PARAMETER);
4398 AssertMsgReturn(idChunkUnmap <= GMM_CHUNKID_LAST, ("%#x\n", idChunkUnmap), VERR_INVALID_PARAMETER);
4399
4400 if ( idChunkMap == NIL_GMM_CHUNKID
4401 && idChunkUnmap == NIL_GMM_CHUNKID)
4402 return VERR_INVALID_PARAMETER;
4403
4404 if (idChunkMap != NIL_GMM_CHUNKID)
4405 {
4406 AssertPtrReturn(ppvR3, VERR_INVALID_POINTER);
4407 *ppvR3 = NIL_RTR3PTR;
4408 }
4409
4410 /*
4411 * Take the semaphore and do the work.
4412 *
4413 * The unmapping is done last since it's easier to undo a mapping than
4414 * an unmapping. The ring-3 mapping cache cannot be so big that it pushes
4415 * the user virtual address space to within a chunk of its limits, so no
4416 * problem here.
4417 */
4418 gmmR0MutexAcquire(pGMM);
4419 if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
4420 {
4421 PGMMCHUNK pMap = NULL;
4422 if (idChunkMap != NIL_GMM_CHUNKID)
4423 {
4424 pMap = gmmR0GetChunk(pGMM, idChunkMap);
4425 if (RT_LIKELY(pMap))
4426 rc = gmmR0MapChunk(pGMM, pGVM, pMap, true /*fRelaxedSem*/, ppvR3);
4427 else
4428 {
4429 Log(("GMMR0MapUnmapChunk: idChunkMap=%#x\n", idChunkMap));
4430 rc = VERR_GMM_CHUNK_NOT_FOUND;
4431 }
4432 }
4433/** @todo split this operation, the bail out might (theoretically) not be
4434 * entirely safe. */
4435
4436 if ( idChunkUnmap != NIL_GMM_CHUNKID
4437 && RT_SUCCESS(rc))
4438 {
4439 PGMMCHUNK pUnmap = gmmR0GetChunk(pGMM, idChunkUnmap);
4440 if (RT_LIKELY(pUnmap))
4441 rc = gmmR0UnmapChunk(pGMM, pGVM, pUnmap, true /*fRelaxedSem*/);
4442 else
4443 {
4444 Log(("GMMR0MapUnmapChunk: idChunkUnmap=%#x\n", idChunkUnmap));
4445 rc = VERR_GMM_CHUNK_NOT_FOUND;
4446 }
4447
4448 if (RT_FAILURE(rc) && pMap)
4449 gmmR0UnmapChunk(pGMM, pGVM, pMap, false /*fRelaxedSem*/);
4450 }
4451
4452 GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
4453 }
4454 else
4455 rc = VERR_GMM_IS_NOT_SANE;
4456 gmmR0MutexRelease(pGMM);
4457
4458 LogFlow(("GMMR0MapUnmapChunk: returns %Rrc\n", rc));
4459 return rc;
4460}
4461
4462
4463/**
4464 * VMMR0 request wrapper for GMMR0MapUnmapChunk.
4465 *
4466 * @returns see GMMR0MapUnmapChunk.
4467 * @param pGVM The global (ring-0) VM structure.
4468 * @param pReq Pointer to the request packet.
4469 */
4470GMMR0DECL(int) GMMR0MapUnmapChunkReq(PGVM pGVM, PGMMMAPUNMAPCHUNKREQ pReq)
4471{
4472 /*
4473 * Validate input and pass it on.
4474 */
4475 AssertPtrReturn(pReq, VERR_INVALID_POINTER);
4476 AssertMsgReturn(pReq->Hdr.cbReq == sizeof(*pReq), ("%#x != %#x\n", pReq->Hdr.cbReq, sizeof(*pReq)), VERR_INVALID_PARAMETER);
4477
4478 return GMMR0MapUnmapChunk(pGVM, pReq->idChunkMap, pReq->idChunkUnmap, &pReq->pvR3);
4479}
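/* Illustrative sketch (not part of the original source): how the ring-3 chunk
 * mapping cache could use the combined request to map a new chunk while
 * evicting an old one in a single call. The header magic and the
 * VMMR0_DO_GMM_MAP_UNMAP_CHUNK operation name are assumptions.
 * @code
 *      GMMMAPUNMAPCHUNKREQ Req;
 *      Req.Hdr.u32Magic = SUPVMMR0REQHDR_MAGIC;    // assumed request header magic
 *      Req.Hdr.cbReq    = sizeof(Req);
 *      Req.idChunkMap   = idNewChunk;              // NIL_GMM_CHUNKID if nothing to map
 *      Req.idChunkUnmap = idEvictedChunk;          // NIL_GMM_CHUNKID if nothing to unmap
 *      Req.pvR3         = NIL_RTR3PTR;             // receives the new ring-3 mapping address
 *      int rc = VMMR3CallR0(pVM, VMMR0_DO_GMM_MAP_UNMAP_CHUNK, 0, &Req.Hdr);
 * @endcode
 */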
4480
4481
4482/**
4483 * Legacy mode API for supplying pages.
4484 *
4485 * The specified user address points to an allocation-chunk-sized block that
4486 * will be locked down and used by the GMM when the GM asks for pages.
4487 *
4488 * @returns VBox status code.
4489 * @param pGVM The global (ring-0) VM structure.
4490 * @param idCpu The VCPU id.
4491 * @param pvR3 Pointer to the chunk size memory block to lock down.
4492 */
4493GMMR0DECL(int) GMMR0SeedChunk(PGVM pGVM, VMCPUID idCpu, RTR3PTR pvR3)
4494{
4495#ifdef GMM_WITH_LEGACY_MODE
4496 /*
4497 * Validate input and get the basics.
4498 */
4499 PGMM pGMM;
4500 GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
4501 int rc = GVMMR0ValidateGVMandEMT(pGVM, idCpu);
4502 if (RT_FAILURE(rc))
4503 return rc;
4504
4505 AssertPtrReturn(pvR3, VERR_INVALID_POINTER);
4506 AssertReturn(!(PAGE_OFFSET_MASK & pvR3), VERR_INVALID_POINTER);
4507
4508 if (!pGMM->fLegacyAllocationMode)
4509 {
4510 Log(("GMMR0SeedChunk: not in legacy allocation mode!\n"));
4511 return VERR_NOT_SUPPORTED;
4512 }
4513
4514 /*
4515 * Lock the memory and add it as new chunk with our hGVM.
4516 * (The GMM locking is done inside gmmR0RegisterChunk.)
4517 */
4518 RTR0MEMOBJ hMemObj;
4519 rc = RTR0MemObjLockUser(&hMemObj, pvR3, GMM_CHUNK_SIZE, RTMEM_PROT_READ | RTMEM_PROT_WRITE, NIL_RTR0PROCESS);
4520 if (RT_SUCCESS(rc))
4521 {
4522 rc = gmmR0RegisterChunk(pGMM, &pGVM->gmm.s.Private, hMemObj, pGVM->hSelf, GMM_CHUNK_FLAGS_SEEDED, NULL);
4523 if (RT_SUCCESS(rc))
4524 gmmR0MutexRelease(pGMM);
4525 else
4526 RTR0MemObjFree(hMemObj, true /* fFreeMappings */);
4527 }
4528
4529 LogFlow(("GMMR0SeedChunk: rc=%d (pvR3=%p)\n", rc, pvR3));
4530 return rc;
4531#else
4532 RT_NOREF(pGVM, idCpu, pvR3);
4533 return VERR_NOT_SUPPORTED;
4534#endif
4535}
4536
4537
4538#ifndef VBOX_WITH_LINEAR_HOST_PHYS_MEM
4539/**
4540 * Gets the ring-0 virtual address for the given page.
4541 *
4542 * This is used by PGM when IEM and such wants to access guest RAM from ring-0.
4543 * One of the ASSUMPTIONS here is that the @a idPage is used by the VM and the
4544 * corresponding chunk will remain valid beyond the call (at least till the EMT
4545 * returns to ring-3).
4546 *
4547 * @returns VBox status code.
4548 * @param pGVM Pointer to the kernel-only VM instance data.
4549 * @param idPage The page ID.
4550 * @param ppv Where to store the address.
4551 * @thread EMT
4552 */
4553GMMR0DECL(int) GMMR0PageIdToVirt(PGVM pGVM, uint32_t idPage, void **ppv)
4554{
4555 *ppv = NULL;
4556 PGMM pGMM;
4557 GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
4558
4559 uint32_t const idChunk = idPage >> GMM_CHUNKID_SHIFT;
4560
4561 /*
4562 * Start with the per-VM TLB.
4563 */
4564 RTSpinlockAcquire(pGVM->gmm.s.hChunkTlbSpinLock);
4565
4566 PGMMPERVMCHUNKTLBE pTlbe = &pGVM->gmm.s.aChunkTlbEntries[GMMPERVM_CHUNKTLB_IDX(idChunk)];
4567 PGMMCHUNK pChunk = pTlbe->pChunk;
4568 if ( pChunk != NULL
4569 && pTlbe->idGeneration == ASMAtomicUoReadU64(&pGMM->idFreeGeneration)
4570 && pChunk->Core.Key == idChunk)
4571 pGVM->R0Stats.gmm.cChunkTlbHits++; /* hopefully this is a likely outcome */
4572 else
4573 {
4574 pGVM->R0Stats.gmm.cChunkTlbMisses++;
4575
4576 /*
4577 * Look it up in the chunk tree.
4578 */
4579 RTSpinlockAcquire(pGMM->hSpinLockTree);
4580 pChunk = gmmR0GetChunkLocked(pGMM, idChunk);
4581 if (RT_LIKELY(pChunk))
4582 {
4583 pTlbe->idGeneration = pGMM->idFreeGeneration;
4584 RTSpinlockRelease(pGMM->hSpinLockTree);
4585 pTlbe->pChunk = pChunk;
4586 }
4587 else
4588 {
4589 RTSpinlockRelease(pGMM->hSpinLockTree);
4590 RTSpinlockRelease(pGVM->gmm.s.hChunkTlbSpinLock);
4591 AssertMsgFailed(("idPage=%#x\n", idPage));
4592 return VERR_GMM_PAGE_NOT_FOUND;
4593 }
4594 }
4595
4596 RTSpinlockRelease(pGVM->gmm.s.hChunkTlbSpinLock);
4597
4598 /*
4599 * Got a chunk, now validate the page ownership and calculate its address.
4600 */
4601 const GMMPAGE * const pPage = &pChunk->aPages[idPage & GMM_PAGEID_IDX_MASK];
4602 if (RT_LIKELY( ( GMM_PAGE_IS_PRIVATE(pPage)
4603 && pPage->Private.hGVM == pGVM->hSelf)
4604 || GMM_PAGE_IS_SHARED(pPage)))
4605 {
4606 AssertPtr(pChunk->pbMapping);
4607 *ppv = &pChunk->pbMapping[(idPage & GMM_PAGEID_IDX_MASK) << PAGE_SHIFT];
4608 return VINF_SUCCESS;
4609 }
4610 AssertMsgFailed(("idPage=%#x is-private=%RTbool Private.hGVM=%u pGVM->hGVM=%u\n",
4611 idPage, GMM_PAGE_IS_PRIVATE(pPage), pPage->Private.hGVM, pGVM->hSelf));
4612 return VERR_GMM_NOT_PAGE_OWNER;
4613}
4614#endif /* !VBOX_WITH_LINEAR_HOST_PHYS_MEM */
4615
4616#ifdef VBOX_WITH_PAGE_SHARING
4617
4618# ifdef VBOX_STRICT
4619/**
4620 * For checksumming shared pages in strict builds.
4621 *
4622 * The purpose is making sure that a page doesn't change.
4623 *
4624 * @returns Checksum, 0 on failure.
4625 * @param pGMM The GMM instance data.
4626 * @param pGVM Pointer to the kernel-only VM instance data.
4627 * @param idPage The page ID.
4628 */
4629static uint32_t gmmR0StrictPageChecksum(PGMM pGMM, PGVM pGVM, uint32_t idPage)
4630{
4631 PGMMCHUNK pChunk = gmmR0GetChunk(pGMM, idPage >> GMM_CHUNKID_SHIFT);
4632 AssertMsgReturn(pChunk, ("idPage=%#x\n", idPage), 0);
4633
4634 uint8_t *pbChunk;
4635 if (!gmmR0IsChunkMapped(pGMM, pGVM, pChunk, (PRTR3PTR)&pbChunk))
4636 return 0;
4637 uint8_t const *pbPage = pbChunk + ((idPage & GMM_PAGEID_IDX_MASK) << PAGE_SHIFT);
4638
4639 return RTCrc32(pbPage, PAGE_SIZE);
4640}
4641# endif /* VBOX_STRICT */
4642
4643
4644/**
4645 * Calculates the module hash value.
4646 *
4647 * @returns Hash value.
4648 * @param pszModuleName The module name.
4649 * @param pszVersion The module version string.
4650 */
4651static uint32_t gmmR0ShModCalcHash(const char *pszModuleName, const char *pszVersion)
4652{
4653 return RTStrHash1ExN(3, pszModuleName, RTSTR_MAX, "::", (size_t)2, pszVersion, RTSTR_MAX);
4654}
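/* Example (illustrative): for a module named "VBoxGuest.dll" at version
 * "6.1.0" the value is in effect the hash of the string "VBoxGuest.dll::6.1.0",
 * so a given name/version pair always lands in the same bucket of the global
 * shared module tree. */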
4655
4656
4657/**
4658 * Finds a global module.
4659 *
4660 * @returns Pointer to the global module on success, NULL if not found.
4661 * @param pGMM The GMM instance data.
4662 * @param uHash The hash as calculated by gmmR0ShModCalcHash.
4663 * @param cbModule The module size.
4664 * @param enmGuestOS The guest OS type.
4665 * @param cRegions The number of regions.
4666 * @param pszModuleName The module name.
4667 * @param pszVersion The module version.
4668 * @param paRegions The region descriptions.
4669 */
4670static PGMMSHAREDMODULE gmmR0ShModFindGlobal(PGMM pGMM, uint32_t uHash, uint32_t cbModule, VBOXOSFAMILY enmGuestOS,
4671 uint32_t cRegions, const char *pszModuleName, const char *pszVersion,
4672 struct VMMDEVSHAREDREGIONDESC const *paRegions)
4673{
4674 for (PGMMSHAREDMODULE pGblMod = (PGMMSHAREDMODULE)RTAvllU32Get(&pGMM->pGlobalSharedModuleTree, uHash);
4675 pGblMod;
4676 pGblMod = (PGMMSHAREDMODULE)pGblMod->Core.pList)
4677 {
4678 if (pGblMod->cbModule != cbModule)
4679 continue;
4680 if (pGblMod->enmGuestOS != enmGuestOS)
4681 continue;
4682 if (pGblMod->cRegions != cRegions)
4683 continue;
4684 if (strcmp(pGblMod->szName, pszModuleName))
4685 continue;
4686 if (strcmp(pGblMod->szVersion, pszVersion))
4687 continue;
4688
4689 uint32_t i;
4690 for (i = 0; i < cRegions; i++)
4691 {
4692 uint32_t off = paRegions[i].GCRegionAddr & PAGE_OFFSET_MASK;
4693 if (pGblMod->aRegions[i].off != off)
4694 break;
4695
4696 uint32_t cb = RT_ALIGN_32(paRegions[i].cbRegion + off, PAGE_SIZE);
4697 if (pGblMod->aRegions[i].cb != cb)
4698 break;
4699 }
4700
4701 if (i == cRegions)
4702 return pGblMod;
4703 }
4704
4705 return NULL;
4706}
4707
4708
4709/**
4710 * Creates a new global module.
4711 *
4712 * @returns VBox status code.
4713 * @param pGMM The GMM instance data.
4714 * @param uHash The hash as calculated by gmmR0ShModCalcHash.
4715 * @param cbModule The module size.
4716 * @param enmGuestOS The guest OS type.
4717 * @param cRegions The number of regions.
4718 * @param pszModuleName The module name.
4719 * @param pszVersion The module version.
4720 * @param paRegions The region descriptions.
4721 * @param ppGblMod Where to return the new module on success.
4722 */
4723static int gmmR0ShModNewGlobal(PGMM pGMM, uint32_t uHash, uint32_t cbModule, VBOXOSFAMILY enmGuestOS,
4724 uint32_t cRegions, const char *pszModuleName, const char *pszVersion,
4725 struct VMMDEVSHAREDREGIONDESC const *paRegions, PGMMSHAREDMODULE *ppGblMod)
4726{
4727 Log(("gmmR0ShModNewGlobal: %s %s size %#x os %u rgn %u\n", pszModuleName, pszVersion, cbModule, enmGuestOS, cRegions));
4728 if (pGMM->cShareableModules >= GMM_MAX_SHARED_GLOBAL_MODULES)
4729 {
4730 Log(("gmmR0ShModNewGlobal: Too many modules\n"));
4731 return VERR_GMM_TOO_MANY_GLOBAL_MODULES;
4732 }
4733
4734 PGMMSHAREDMODULE pGblMod = (PGMMSHAREDMODULE)RTMemAllocZ(RT_UOFFSETOF_DYN(GMMSHAREDMODULE, aRegions[cRegions]));
4735 if (!pGblMod)
4736 {
4737 Log(("gmmR0ShModNewGlobal: No memory\n"));
4738 return VERR_NO_MEMORY;
4739 }
4740
4741 pGblMod->Core.Key = uHash;
4742 pGblMod->cbModule = cbModule;
4743 pGblMod->cRegions = cRegions;
4744 pGblMod->cUsers = 1;
4745 pGblMod->enmGuestOS = enmGuestOS;
4746 strcpy(pGblMod->szName, pszModuleName);
4747 strcpy(pGblMod->szVersion, pszVersion);
4748
4749 for (uint32_t i = 0; i < cRegions; i++)
4750 {
4751 Log(("gmmR0ShModNewGlobal: rgn[%u]=%RGvLB%#x\n", i, paRegions[i].GCRegionAddr, paRegions[i].cbRegion));
4752 pGblMod->aRegions[i].off = paRegions[i].GCRegionAddr & PAGE_OFFSET_MASK;
4753 pGblMod->aRegions[i].cb = paRegions[i].cbRegion + pGblMod->aRegions[i].off;
4754 pGblMod->aRegions[i].cb = RT_ALIGN_32(pGblMod->aRegions[i].cb, PAGE_SIZE);
4755 pGblMod->aRegions[i].paidPages = NULL; /* allocated when needed. */
4756 }
4757
4758 bool fInsert = RTAvllU32Insert(&pGMM->pGlobalSharedModuleTree, &pGblMod->Core);
4759 Assert(fInsert); NOREF(fInsert);
4760 pGMM->cShareableModules++;
4761
4762 *ppGblMod = pGblMod;
4763 return VINF_SUCCESS;
4764}
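/* Worked example (illustrative, assuming 4 KiB pages): a region descriptor
 * with GCRegionAddr 0x7fff1234 and cbRegion 0x1800 is stored with
 * off = 0x234 and cb = RT_ALIGN_32(0x1800 + 0x234, PAGE_SIZE) = 0x2000, i.e.
 * the page-aligned span covering the region. gmmR0ShModFindGlobal compares
 * exactly these rounded values when matching a registration against an
 * existing global module. */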
4765
4766
4767/**
4768 * Deletes a global module which is no longer referenced by anyone.
4769 *
4770 * @param pGMM The GMM instance data.
4771 * @param pGblMod The module to delete.
4772 */
4773static void gmmR0ShModDeleteGlobal(PGMM pGMM, PGMMSHAREDMODULE pGblMod)
4774{
4775 Assert(pGblMod->cUsers == 0);
4776 Assert(pGMM->cShareableModules > 0 && pGMM->cShareableModules <= GMM_MAX_SHARED_GLOBAL_MODULES);
4777
4778 void *pvTest = RTAvllU32RemoveNode(&pGMM->pGlobalSharedModuleTree, &pGblMod->Core);
4779 Assert(pvTest == pGblMod); NOREF(pvTest);
4780 pGMM->cShareableModules--;
4781
4782 uint32_t i = pGblMod->cRegions;
4783 while (i-- > 0)
4784 {
4785 if (pGblMod->aRegions[i].paidPages)
4786 {
4787 /* We don't do anything to the pages as they are handled by the
4788 copy-on-write mechanism in PGM. */
4789 RTMemFree(pGblMod->aRegions[i].paidPages);
4790 pGblMod->aRegions[i].paidPages = NULL;
4791 }
4792 }
4793 RTMemFree(pGblMod);
4794}
4795
4796
4797static int gmmR0ShModNewPerVM(PGVM pGVM, RTGCPTR GCBaseAddr, uint32_t cRegions, const VMMDEVSHAREDREGIONDESC *paRegions,
4798 PGMMSHAREDMODULEPERVM *ppRecVM)
4799{
4800 if (pGVM->gmm.s.Stats.cShareableModules >= GMM_MAX_SHARED_PER_VM_MODULES)
4801 return VERR_GMM_TOO_MANY_PER_VM_MODULES;
4802
4803 PGMMSHAREDMODULEPERVM pRecVM;
4804 pRecVM = (PGMMSHAREDMODULEPERVM)RTMemAllocZ(RT_UOFFSETOF_DYN(GMMSHAREDMODULEPERVM, aRegionsGCPtrs[cRegions]));
4805 if (!pRecVM)
4806 return VERR_NO_MEMORY;
4807
4808 pRecVM->Core.Key = GCBaseAddr;
4809 for (uint32_t i = 0; i < cRegions; i++)
4810 pRecVM->aRegionsGCPtrs[i] = paRegions[i].GCRegionAddr;
4811
4812 bool fInsert = RTAvlGCPtrInsert(&pGVM->gmm.s.pSharedModuleTree, &pRecVM->Core);
4813 Assert(fInsert); NOREF(fInsert);
4814 pGVM->gmm.s.Stats.cShareableModules++;
4815
4816 *ppRecVM = pRecVM;
4817 return VINF_SUCCESS;
4818}
4819
4820
4821static void gmmR0ShModDeletePerVM(PGMM pGMM, PGVM pGVM, PGMMSHAREDMODULEPERVM pRecVM, bool fRemove)
4822{
4823 /*
4824 * Free the per-VM module.
4825 */
4826 PGMMSHAREDMODULE pGblMod = pRecVM->pGlobalModule;
4827 pRecVM->pGlobalModule = NULL;
4828
4829 if (fRemove)
4830 {
4831 void *pvTest = RTAvlGCPtrRemove(&pGVM->gmm.s.pSharedModuleTree, pRecVM->Core.Key);
4832 Assert(pvTest == &pRecVM->Core); NOREF(pvTest);
4833 }
4834
4835 RTMemFree(pRecVM);
4836
4837 /*
4838 * Release the global module.
4839 * (In the registration bailout case, it might not be.)
4840 */
4841 if (pGblMod)
4842 {
4843 Assert(pGblMod->cUsers > 0);
4844 pGblMod->cUsers--;
4845 if (pGblMod->cUsers == 0)
4846 gmmR0ShModDeleteGlobal(pGMM, pGblMod);
4847 }
4848}
4849
4850#endif /* VBOX_WITH_PAGE_SHARING */
4851
4852/**
4853 * Registers a new shared module for the VM.
4854 *
4855 * @returns VBox status code.
4856 * @param pGVM The global (ring-0) VM structure.
4857 * @param idCpu The VCPU id.
4858 * @param enmGuestOS The guest OS type.
4859 * @param pszModuleName The module name.
4860 * @param pszVersion The module version.
4861 * @param GCPtrModBase The module base address.
4862 * @param cbModule The module size.
4863 * @param cRegions The number of shared region descriptors.
4864 * @param paRegions Pointer to an array of shared region(s).
4865 * @thread EMT(idCpu)
4866 */
4867GMMR0DECL(int) GMMR0RegisterSharedModule(PGVM pGVM, VMCPUID idCpu, VBOXOSFAMILY enmGuestOS, char *pszModuleName,
4868 char *pszVersion, RTGCPTR GCPtrModBase, uint32_t cbModule,
4869 uint32_t cRegions, struct VMMDEVSHAREDREGIONDESC const *paRegions)
4870{
4871#ifdef VBOX_WITH_PAGE_SHARING
4872 /*
4873 * Validate input and get the basics.
4874 *
4875 * Note! Turns out the module size does not necessarily match the size of the
4876 * regions. (iTunes on XP)
4877 */
4878 PGMM pGMM;
4879 GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
4880 int rc = GVMMR0ValidateGVMandEMT(pGVM, idCpu);
4881 if (RT_FAILURE(rc))
4882 return rc;
4883
4884 if (RT_UNLIKELY(cRegions > VMMDEVSHAREDREGIONDESC_MAX))
4885 return VERR_GMM_TOO_MANY_REGIONS;
4886
4887 if (RT_UNLIKELY(cbModule == 0 || cbModule > _1G))
4888 return VERR_GMM_BAD_SHARED_MODULE_SIZE;
4889
4890 uint32_t cbTotal = 0;
4891 for (uint32_t i = 0; i < cRegions; i++)
4892 {
4893 if (RT_UNLIKELY(paRegions[i].cbRegion == 0 || paRegions[i].cbRegion > _1G))
4894 return VERR_GMM_SHARED_MODULE_BAD_REGIONS_SIZE;
4895
4896 cbTotal += paRegions[i].cbRegion;
4897 if (RT_UNLIKELY(cbTotal > _1G))
4898 return VERR_GMM_SHARED_MODULE_BAD_REGIONS_SIZE;
4899 }
4900
4901 AssertPtrReturn(pszModuleName, VERR_INVALID_POINTER);
4902 if (RT_UNLIKELY(!memchr(pszModuleName, '\0', GMM_SHARED_MODULE_MAX_NAME_STRING)))
4903 return VERR_GMM_MODULE_NAME_TOO_LONG;
4904
4905 AssertPtrReturn(pszVersion, VERR_INVALID_POINTER);
4906 if (RT_UNLIKELY(!memchr(pszVersion, '\0', GMM_SHARED_MODULE_MAX_VERSION_STRING)))
4907 return VERR_GMM_MODULE_NAME_TOO_LONG;
4908
4909 uint32_t const uHash = gmmR0ShModCalcHash(pszModuleName, pszVersion);
4910 Log(("GMMR0RegisterSharedModule %s %s base %RGv size %x hash %x\n", pszModuleName, pszVersion, GCPtrModBase, cbModule, uHash));
4911
4912 /*
4913 * Take the semaphore and do some more validations.
4914 */
4915 gmmR0MutexAcquire(pGMM);
4916 if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
4917 {
4918 /*
4919 * Check if this module is already locally registered and register
4920 * it if it isn't. The base address is a unique module identifier
4921 * locally.
4922 */
4923 PGMMSHAREDMODULEPERVM pRecVM = (PGMMSHAREDMODULEPERVM)RTAvlGCPtrGet(&pGVM->gmm.s.pSharedModuleTree, GCPtrModBase);
4924 bool fNewModule = pRecVM == NULL;
4925 if (fNewModule)
4926 {
4927 rc = gmmR0ShModNewPerVM(pGVM, GCPtrModBase, cRegions, paRegions, &pRecVM);
4928 if (RT_SUCCESS(rc))
4929 {
4930 /*
4931 * Find a matching global module, register a new one if needed.
4932 */
4933 PGMMSHAREDMODULE pGblMod = gmmR0ShModFindGlobal(pGMM, uHash, cbModule, enmGuestOS, cRegions,
4934 pszModuleName, pszVersion, paRegions);
4935 if (!pGblMod)
4936 {
4937 Assert(fNewModule);
4938 rc = gmmR0ShModNewGlobal(pGMM, uHash, cbModule, enmGuestOS, cRegions,
4939 pszModuleName, pszVersion, paRegions, &pGblMod);
4940 if (RT_SUCCESS(rc))
4941 {
4942 pRecVM->pGlobalModule = pGblMod; /* (One reference returned by gmmR0ShModNewGlobal.) */
4943 Log(("GMMR0RegisterSharedModule: new module %s %s\n", pszModuleName, pszVersion));
4944 }
4945 else
4946 gmmR0ShModDeletePerVM(pGMM, pGVM, pRecVM, true /*fRemove*/);
4947 }
4948 else
4949 {
4950 Assert(pGblMod->cUsers > 0 && pGblMod->cUsers < UINT32_MAX / 2);
4951 pGblMod->cUsers++;
4952 pRecVM->pGlobalModule = pGblMod;
4953
4954 Log(("GMMR0RegisterSharedModule: new per vm module %s %s, gbl users %d\n", pszModuleName, pszVersion, pGblMod->cUsers));
4955 }
4956 }
4957 }
4958 else
4959 {
4960 /*
4961 * Attempt to re-register an existing module.
4962 */
4963 PGMMSHAREDMODULE pGblMod = gmmR0ShModFindGlobal(pGMM, uHash, cbModule, enmGuestOS, cRegions,
4964 pszModuleName, pszVersion, paRegions);
4965 if (pRecVM->pGlobalModule == pGblMod)
4966 {
4967 Log(("GMMR0RegisterSharedModule: already registered %s %s, gbl users %d\n", pszModuleName, pszVersion, pGblMod->cUsers));
4968 rc = VINF_GMM_SHARED_MODULE_ALREADY_REGISTERED;
4969 }
4970 else
4971 {
4972 /** @todo may have to unregister+register when this happens in case it's caused
4973 * by VBoxService crashing and being restarted... */
4974 Log(("GMMR0RegisterSharedModule: Address clash!\n"
4975 " incoming at %RGvLB%#x %s %s rgns %u\n"
4976 " existing at %RGvLB%#x %s %s rgns %u\n",
4977 GCPtrModBase, cbModule, pszModuleName, pszVersion, cRegions,
4978 pRecVM->Core.Key, pRecVM->pGlobalModule->cbModule, pRecVM->pGlobalModule->szName,
4979 pRecVM->pGlobalModule->szVersion, pRecVM->pGlobalModule->cRegions));
4980 rc = VERR_GMM_SHARED_MODULE_ADDRESS_CLASH;
4981 }
4982 }
4983 GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
4984 }
4985 else
4986 rc = VERR_GMM_IS_NOT_SANE;
4987
4988 gmmR0MutexRelease(pGMM);
4989 return rc;
4990#else
4991
4992 NOREF(pGVM); NOREF(idCpu); NOREF(enmGuestOS); NOREF(pszModuleName); NOREF(pszVersion);
4993 NOREF(GCPtrModBase); NOREF(cbModule); NOREF(cRegions); NOREF(paRegions);
4994 return VERR_NOT_IMPLEMENTED;
4995#endif
4996}
4997
4998
4999/**
5000 * VMMR0 request wrapper for GMMR0RegisterSharedModule.
5001 *
5002 * @returns see GMMR0RegisterSharedModule.
5003 * @param pGVM The global (ring-0) VM structure.
5004 * @param idCpu The VCPU id.
5005 * @param pReq Pointer to the request packet.
5006 */
5007GMMR0DECL(int) GMMR0RegisterSharedModuleReq(PGVM pGVM, VMCPUID idCpu, PGMMREGISTERSHAREDMODULEREQ pReq)
5008{
5009 /*
5010 * Validate input and pass it on.
5011 */
5012 AssertPtrReturn(pReq, VERR_INVALID_POINTER);
5013 AssertMsgReturn( pReq->Hdr.cbReq >= sizeof(*pReq)
5014 && pReq->Hdr.cbReq == RT_UOFFSETOF_DYN(GMMREGISTERSHAREDMODULEREQ, aRegions[pReq->cRegions]),
5015 ("%#x != %#x\n", pReq->Hdr.cbReq, sizeof(*pReq)), VERR_INVALID_PARAMETER);
5016
5017 /* Pass back return code in the request packet to preserve informational codes. (VMMR3CallR0 chokes on them) */
5018 pReq->rc = GMMR0RegisterSharedModule(pGVM, idCpu, pReq->enmGuestOS, pReq->szName, pReq->szVersion,
5019 pReq->GCBaseAddr, pReq->cbModule, pReq->cRegions, pReq->aRegions);
5020 return VINF_SUCCESS;
5021}
5022
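/*
 * A minimal, illustrative sketch of how a ring-3 caller might size and fill
 * the request consumed by the wrapper above.  Only fields the wrapper
 * validates or forwards are shown; the request header magic, the actual
 * VMMR0 call and error handling are omitted, and the local variable names
 * are hypothetical.
 *
 * @code
 *    uint32_t const cbReq = RT_UOFFSETOF_DYN(GMMREGISTERSHAREDMODULEREQ, aRegions[cRegions]);
 *    PGMMREGISTERSHAREDMODULEREQ pReq = (PGMMREGISTERSHAREDMODULEREQ)RTMemAllocZ(cbReq);
 *    pReq->Hdr.cbReq  = cbReq;            // must match the RT_UOFFSETOF_DYN check above
 *    pReq->enmGuestOS = enmGuestOS;
 *    pReq->GCBaseAddr = GCPtrModBase;
 *    pReq->cbModule   = cbModule;
 *    pReq->cRegions   = cRegions;
 *    RTStrCopy(pReq->szName, sizeof(pReq->szName), pszModuleName);
 *    RTStrCopy(pReq->szVersion, sizeof(pReq->szVersion), pszVersion);
 *    // ... copy the cRegions region descriptors into pReq->aRegions[] and
 *    //     submit the request ...
 * @endcode
 */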
5023
5024/**
5025 * Unregisters a shared module for the VM
5026 *
5027 * @returns VBox status code.
5028 * @param pGVM The global (ring-0) VM structure.
5029 * @param idCpu The VCPU id.
5030 * @param pszModuleName The module name.
5031 * @param pszVersion The module version.
5032 * @param GCPtrModBase The module base address.
5033 * @param cbModule The module size.
5034 */
5035GMMR0DECL(int) GMMR0UnregisterSharedModule(PGVM pGVM, VMCPUID idCpu, char *pszModuleName, char *pszVersion,
5036 RTGCPTR GCPtrModBase, uint32_t cbModule)
5037{
5038#ifdef VBOX_WITH_PAGE_SHARING
5039 /*
5040 * Validate input and get the basics.
5041 */
5042 PGMM pGMM;
5043 GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
5044 int rc = GVMMR0ValidateGVMandEMT(pGVM, idCpu);
5045 if (RT_FAILURE(rc))
5046 return rc;
5047
5048 AssertPtrReturn(pszModuleName, VERR_INVALID_POINTER);
5049 AssertPtrReturn(pszVersion, VERR_INVALID_POINTER);
5050 if (RT_UNLIKELY(!memchr(pszModuleName, '\0', GMM_SHARED_MODULE_MAX_NAME_STRING)))
5051 return VERR_GMM_MODULE_NAME_TOO_LONG;
5052 if (RT_UNLIKELY(!memchr(pszVersion, '\0', GMM_SHARED_MODULE_MAX_VERSION_STRING)))
5053 return VERR_GMM_MODULE_NAME_TOO_LONG;
5054
5055 Log(("GMMR0UnregisterSharedModule %s %s base=%RGv size %x\n", pszModuleName, pszVersion, GCPtrModBase, cbModule));
5056
5057 /*
5058 * Take the semaphore and do some more validations.
5059 */
5060 gmmR0MutexAcquire(pGMM);
5061 if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
5062 {
5063 /*
5064 * Locate and remove the specified module.
5065 */
5066 PGMMSHAREDMODULEPERVM pRecVM = (PGMMSHAREDMODULEPERVM)RTAvlGCPtrGet(&pGVM->gmm.s.pSharedModuleTree, GCPtrModBase);
5067 if (pRecVM)
5068 {
5069 /** @todo Do we need to do more validations here, like that the
5070 * name + version + cbModule matches? */
5071 NOREF(cbModule);
5072 Assert(pRecVM->pGlobalModule);
5073 gmmR0ShModDeletePerVM(pGMM, pGVM, pRecVM, true /*fRemove*/);
5074 }
5075 else
5076 rc = VERR_GMM_SHARED_MODULE_NOT_FOUND;
5077
5078 GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
5079 }
5080 else
5081 rc = VERR_GMM_IS_NOT_SANE;
5082
5083 gmmR0MutexRelease(pGMM);
5084 return rc;
5085#else
5086
5087 NOREF(pGVM); NOREF(idCpu); NOREF(pszModuleName); NOREF(pszVersion); NOREF(GCPtrModBase); NOREF(cbModule);
5088 return VERR_NOT_IMPLEMENTED;
5089#endif
5090}
5091
5092
5093/**
5094 * VMMR0 request wrapper for GMMR0UnregisterSharedModule.
5095 *
5096 * @returns see GMMR0UnregisterSharedModule.
5097 * @param pGVM The global (ring-0) VM structure.
5098 * @param idCpu The VCPU id.
5099 * @param pReq Pointer to the request packet.
5100 */
5101GMMR0DECL(int) GMMR0UnregisterSharedModuleReq(PGVM pGVM, VMCPUID idCpu, PGMMUNREGISTERSHAREDMODULEREQ pReq)
5102{
5103 /*
5104 * Validate input and pass it on.
5105 */
5106 AssertPtrReturn(pReq, VERR_INVALID_POINTER);
5107 AssertMsgReturn(pReq->Hdr.cbReq == sizeof(*pReq), ("%#x != %#x\n", pReq->Hdr.cbReq, sizeof(*pReq)), VERR_INVALID_PARAMETER);
5108
5109 return GMMR0UnregisterSharedModule(pGVM, idCpu, pReq->szName, pReq->szVersion, pReq->GCBaseAddr, pReq->cbModule);
5110}
5111
5112#ifdef VBOX_WITH_PAGE_SHARING
5113
5114/**
5115 * Increases the use count of a shared page. The page is known to exist and be valid.
5116 *
5117 * @param pGMM Pointer to the GMM instance.
5118 * @param pGVM Pointer to the GVM instance.
5119 * @param pPage The page structure.
5120 */
5121DECLINLINE(void) gmmR0UseSharedPage(PGMM pGMM, PGVM pGVM, PGMMPAGE pPage)
5122{
5123 Assert(pGMM->cSharedPages > 0);
5124 Assert(pGMM->cAllocatedPages > 0);
5125
5126 pGMM->cDuplicatePages++;
5127
5128 pPage->Shared.cRefs++;
5129 pGVM->gmm.s.Stats.cSharedPages++;
5130 pGVM->gmm.s.Stats.Allocated.cBasePages++;
5131}
5132
5133
5134/**
5135 * Converts a private page to a shared page. The page is known to exist and be valid.
5136 *
5137 * @param pGMM Pointer to the GMM instance.
5138 * @param pGVM Pointer to the GVM instance.
5139 * @param HCPhys Host physical address
5140 * @param idPage The Page ID
5141 * @param pPage The page structure.
5142 * @param pPageDesc Shared page descriptor
5143 */
5144DECLINLINE(void) gmmR0ConvertToSharedPage(PGMM pGMM, PGVM pGVM, RTHCPHYS HCPhys, uint32_t idPage, PGMMPAGE pPage,
5145 PGMMSHAREDPAGEDESC pPageDesc)
5146{
5147 PGMMCHUNK pChunk = gmmR0GetChunk(pGMM, idPage >> GMM_CHUNKID_SHIFT);
5148 Assert(pChunk);
5149 Assert(pChunk->cFree < GMM_CHUNK_NUM_PAGES);
5150 Assert(GMM_PAGE_IS_PRIVATE(pPage));
5151
5152 pChunk->cPrivate--;
5153 pChunk->cShared++;
5154
5155 pGMM->cSharedPages++;
5156
5157 pGVM->gmm.s.Stats.cSharedPages++;
5158 pGVM->gmm.s.Stats.cPrivatePages--;
5159
5160 /* Modify the page structure. */
5161 pPage->Shared.pfn = (uint32_t)(uint64_t)(HCPhys >> PAGE_SHIFT);
5162 pPage->Shared.cRefs = 1;
5163#ifdef VBOX_STRICT
5164 pPageDesc->u32StrictChecksum = gmmR0StrictPageChecksum(pGMM, pGVM, idPage);
5165 pPage->Shared.u14Checksum = pPageDesc->u32StrictChecksum;
5166#else
5167 NOREF(pPageDesc);
5168 pPage->Shared.u14Checksum = 0;
5169#endif
5170 pPage->Shared.u2State = GMM_PAGE_STATE_SHARED;
5171}
5172
5173
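/**
 * Worker for GMMR0SharedModuleCheckPage, handling a page seen for the first
 * time.
 *
 * The caller's private page is converted to a shared page and its ID is
 * recorded in the global region descriptor so later checks can find it.
 *
 * @returns VBox status code.
 * @param   pGMM            Pointer to the GMM instance.
 * @param   pGVM            Pointer to the GVM instance.
 * @param   pModule         The shared module (currently unused).
 * @param   idxRegion       The region index (assertions only).
 * @param   idxPage         The page index within the region.
 * @param   pPageDesc       The page descriptor (idPage, GCPhys, HCPhys on input).
 * @param   pGlobalRegion   The global region descriptor; paidPages[idxPage] is updated.
 */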
5174static int gmmR0SharedModuleCheckPageFirstTime(PGMM pGMM, PGVM pGVM, PGMMSHAREDMODULE pModule,
5175 unsigned idxRegion, unsigned idxPage,
5176 PGMMSHAREDPAGEDESC pPageDesc, PGMMSHAREDREGIONDESC pGlobalRegion)
5177{
5178 NOREF(pModule);
5179
5180 /* Easy case: just change the internal page type. */
5181 PGMMPAGE pPage = gmmR0GetPage(pGMM, pPageDesc->idPage);
5182 AssertMsgReturn(pPage, ("idPage=%#x (GCPhys=%RGp HCPhys=%RHp idxRegion=%#x idxPage=%#x) #1\n",
5183 pPageDesc->idPage, pPageDesc->GCPhys, pPageDesc->HCPhys, idxRegion, idxPage),
5184 VERR_PGM_PHYS_INVALID_PAGE_ID);
5185 NOREF(idxRegion);
5186
5187    AssertMsg(pPageDesc->GCPhys == (pPage->Private.pfn << 12), ("desc %RGp gmm %RGp\n", pPageDesc->GCPhys, (pPage->Private.pfn << 12)));
5188
5189 gmmR0ConvertToSharedPage(pGMM, pGVM, pPageDesc->HCPhys, pPageDesc->idPage, pPage, pPageDesc);
5190
5191 /* Keep track of these references. */
5192 pGlobalRegion->paidPages[idxPage] = pPageDesc->idPage;
5193
5194 return VINF_SUCCESS;
5195}
5196
5197/**
5198 * Checks a page in the specified shared module region for changes.
5199 *
5200 * Performs the following tasks:
5201 * - If a shared page is new, then it changes the GMM page type to shared and
5202 * returns it in the pPageDesc descriptor.
5203 * - If a shared page already exists, then it checks if the VM page is
5204 * identical and if so frees the VM page and returns the shared page in
5205 * pPageDesc descriptor.
5206 *
5207 * @remarks ASSUMES the caller has acquired the GMM semaphore!!
5208 *
5209 * @returns VBox status code.
5210 * @param pGVM Pointer to the GVM instance data.
5211 * @param pModule Module description
5212 * @param idxRegion Region index
5213 * @param idxPage Page index
5214 * @param pPageDesc Page descriptor
5215 */
5216GMMR0DECL(int) GMMR0SharedModuleCheckPage(PGVM pGVM, PGMMSHAREDMODULE pModule, uint32_t idxRegion, uint32_t idxPage,
5217 PGMMSHAREDPAGEDESC pPageDesc)
5218{
5219 int rc;
5220 PGMM pGMM;
5221 GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
5222 pPageDesc->u32StrictChecksum = 0;
5223
5224 AssertMsgReturn(idxRegion < pModule->cRegions,
5225 ("idxRegion=%#x cRegions=%#x %s %s\n", idxRegion, pModule->cRegions, pModule->szName, pModule->szVersion),
5226 VERR_INVALID_PARAMETER);
5227
5228 uint32_t const cPages = pModule->aRegions[idxRegion].cb >> PAGE_SHIFT;
5229 AssertMsgReturn(idxPage < cPages,
5230                    ("idxPage=%#x cPages=%#x %s %s\n", idxPage, cPages, pModule->szName, pModule->szVersion),
5231 VERR_INVALID_PARAMETER);
5232
5233    LogFlow(("GMMR0SharedModuleCheckPage %s base %RGv region %d idxPage %d\n", pModule->szName, pModule->Core.Key, idxRegion, idxPage));
5234
5235 /*
5236 * First time; create a page descriptor array.
5237 */
5238 PGMMSHAREDREGIONDESC pGlobalRegion = &pModule->aRegions[idxRegion];
5239 if (!pGlobalRegion->paidPages)
5240 {
5241 Log(("Allocate page descriptor array for %d pages\n", cPages));
5242 pGlobalRegion->paidPages = (uint32_t *)RTMemAlloc(cPages * sizeof(pGlobalRegion->paidPages[0]));
5243 AssertReturn(pGlobalRegion->paidPages, VERR_NO_MEMORY);
5244
5245 /* Invalidate all descriptors. */
5246 uint32_t i = cPages;
5247 while (i-- > 0)
5248 pGlobalRegion->paidPages[i] = NIL_GMM_PAGEID;
5249 }
5250
5251 /*
5252 * We've seen this shared page for the first time?
5253 */
5254 if (pGlobalRegion->paidPages[idxPage] == NIL_GMM_PAGEID)
5255 {
5256 Log(("New shared page guest %RGp host %RHp\n", pPageDesc->GCPhys, pPageDesc->HCPhys));
5257 return gmmR0SharedModuleCheckPageFirstTime(pGMM, pGVM, pModule, idxRegion, idxPage, pPageDesc, pGlobalRegion);
5258 }
5259
5260 /*
5261 * We've seen it before...
5262 */
5263 Log(("Replace existing page guest %RGp host %RHp id %#x -> id %#x\n",
5264 pPageDesc->GCPhys, pPageDesc->HCPhys, pPageDesc->idPage, pGlobalRegion->paidPages[idxPage]));
5265 Assert(pPageDesc->idPage != pGlobalRegion->paidPages[idxPage]);
5266
5267 /*
5268 * Get the shared page source.
5269 */
5270 PGMMPAGE pPage = gmmR0GetPage(pGMM, pGlobalRegion->paidPages[idxPage]);
5271    AssertMsgReturn(pPage, ("idPage=%#x (idxRegion=%#x idxPage=%#x) #2\n", pGlobalRegion->paidPages[idxPage], idxRegion, idxPage),
5272 VERR_PGM_PHYS_INVALID_PAGE_ID);
5273
5274 if (pPage->Common.u2State != GMM_PAGE_STATE_SHARED)
5275 {
5276 /*
5277 * Page was freed at some point; invalidate this entry.
5278 */
5279 /** @todo this isn't really bullet proof. */
5280 Log(("Old shared page was freed -> create a new one\n"));
5281 pGlobalRegion->paidPages[idxPage] = NIL_GMM_PAGEID;
5282 return gmmR0SharedModuleCheckPageFirstTime(pGMM, pGVM, pModule, idxRegion, idxPage, pPageDesc, pGlobalRegion);
5283 }
5284
5285    Log(("Replace existing page: host %RHp -> %RHp\n", pPageDesc->HCPhys, ((uint64_t)pPage->Shared.pfn) << PAGE_SHIFT));
5286
5287 /*
5288 * Calculate the virtual address of the local page.
5289 */
5290 PGMMCHUNK pChunk = gmmR0GetChunk(pGMM, pPageDesc->idPage >> GMM_CHUNKID_SHIFT);
5291 AssertMsgReturn(pChunk, ("idPage=%#x (idxRegion=%#x idxPage=%#x) #4\n", pPageDesc->idPage, idxRegion, idxPage),
5292 VERR_PGM_PHYS_INVALID_PAGE_ID);
5293
5294 uint8_t *pbChunk;
5295 AssertMsgReturn(gmmR0IsChunkMapped(pGMM, pGVM, pChunk, (PRTR3PTR)&pbChunk),
5296 ("idPage=%#x (idxRegion=%#x idxPage=%#x) #3\n", pPageDesc->idPage, idxRegion, idxPage),
5297 VERR_PGM_PHYS_INVALID_PAGE_ID);
5298 uint8_t *pbLocalPage = pbChunk + ((pPageDesc->idPage & GMM_PAGEID_IDX_MASK) << PAGE_SHIFT);
5299
5300 /*
5301 * Calculate the virtual address of the shared page.
5302 */
5303 pChunk = gmmR0GetChunk(pGMM, pGlobalRegion->paidPages[idxPage] >> GMM_CHUNKID_SHIFT);
5304 Assert(pChunk); /* can't fail as gmmR0GetPage succeeded. */
5305
5306 /*
5307 * Get the virtual address of the physical page; map the chunk into the VM
5308 * process if not already done.
5309 */
5310 if (!gmmR0IsChunkMapped(pGMM, pGVM, pChunk, (PRTR3PTR)&pbChunk))
5311 {
5312 Log(("Map chunk into process!\n"));
5313 rc = gmmR0MapChunk(pGMM, pGVM, pChunk, false /*fRelaxedSem*/, (PRTR3PTR)&pbChunk);
5314 AssertRCReturn(rc, rc);
5315 }
5316 uint8_t *pbSharedPage = pbChunk + ((pGlobalRegion->paidPages[idxPage] & GMM_PAGEID_IDX_MASK) << PAGE_SHIFT);
5317
5318#ifdef VBOX_STRICT
5319 pPageDesc->u32StrictChecksum = RTCrc32(pbSharedPage, PAGE_SIZE);
5320 uint32_t uChecksum = pPageDesc->u32StrictChecksum & UINT32_C(0x00003fff);
5321 AssertMsg(!uChecksum || uChecksum == pPage->Shared.u14Checksum || !pPage->Shared.u14Checksum,
5322 ("%#x vs %#x - idPage=%#x - %s %s\n", uChecksum, pPage->Shared.u14Checksum,
5323 pGlobalRegion->paidPages[idxPage], pModule->szName, pModule->szVersion));
5324#endif
5325
5326 /** @todo write ASMMemComparePage. */
5327 if (memcmp(pbSharedPage, pbLocalPage, PAGE_SIZE))
5328 {
5329 Log(("Unexpected differences found between local and shared page; skip\n"));
5330 /* Signal to the caller that this one hasn't changed. */
5331 pPageDesc->idPage = NIL_GMM_PAGEID;
5332 return VINF_SUCCESS;
5333 }
5334
5335 /*
5336 * Free the old local page.
5337 */
5338 GMMFREEPAGEDESC PageDesc;
5339 PageDesc.idPage = pPageDesc->idPage;
5340 rc = gmmR0FreePages(pGMM, pGVM, 1, &PageDesc, GMMACCOUNT_BASE);
5341 AssertRCReturn(rc, rc);
5342
5343 gmmR0UseSharedPage(pGMM, pGVM, pPage);
5344
5345 /*
5346 * Pass along the new physical address & page id.
5347 */
5348 pPageDesc->HCPhys = ((uint64_t)pPage->Shared.pfn) << PAGE_SHIFT;
5349 pPageDesc->idPage = pGlobalRegion->paidPages[idxPage];
5350
5351 return VINF_SUCCESS;
5352}
5353
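/*
 * A rough sketch of the caller-side pattern implied by the function above
 * (illustrative only; the real caller lives in PGM, and the descriptor setup
 * and remap step shown here are hypothetical):
 *
 * @code
 *    GMMSHAREDPAGEDESC PageDesc;
 *    PageDesc.idPage = idPage;       // the VM's current (private) page
 *    PageDesc.GCPhys = GCPhys;
 *    PageDesc.HCPhys = HCPhys;
 *    int rc = GMMR0SharedModuleCheckPage(pGVM, pModule, idxRegion, idxPage, &PageDesc);
 *    if (RT_SUCCESS(rc) && PageDesc.idPage != NIL_GMM_PAGEID)
 *    {
 *        // The page is now backed by the shared copy; update the guest
 *        // physical page to use PageDesc.HCPhys / PageDesc.idPage.
 *    }
 *    // A NIL_GMM_PAGEID result means the contents differed and nothing changed.
 * @endcode
 */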
5354
5355/**
5356 * RTAvlGCPtrDestroy callback.
5357 *
5358 * @returns VINF_SUCCESS.
5359 * @param pNode The node to destroy.
5360 * @param pvArgs Pointer to an argument packet.
5361 */
5362static DECLCALLBACK(int) gmmR0CleanupSharedModule(PAVLGCPTRNODECORE pNode, void *pvArgs)
5363{
5364 gmmR0ShModDeletePerVM(((GMMR0SHMODPERVMDTORARGS *)pvArgs)->pGMM,
5365 ((GMMR0SHMODPERVMDTORARGS *)pvArgs)->pGVM,
5366 (PGMMSHAREDMODULEPERVM)pNode,
5367 false /*fRemove*/);
5368 return VINF_SUCCESS;
5369}
5370
5371
5372/**
5373 * Used by GMMR0CleanupVM to clean up shared modules.
5374 *
5375 * This is called without taking the GMM lock so that it can be yielded as
5376 * needed here.
5377 *
5378 * @param pGMM The GMM handle.
5379 * @param pGVM The global VM handle.
5380 */
5381static void gmmR0SharedModuleCleanup(PGMM pGMM, PGVM pGVM)
5382{
5383 gmmR0MutexAcquire(pGMM);
5384 GMM_CHECK_SANITY_UPON_ENTERING(pGMM);
5385
5386 GMMR0SHMODPERVMDTORARGS Args;
5387 Args.pGVM = pGVM;
5388 Args.pGMM = pGMM;
5389 RTAvlGCPtrDestroy(&pGVM->gmm.s.pSharedModuleTree, gmmR0CleanupSharedModule, &Args);
5390
5391 AssertMsg(pGVM->gmm.s.Stats.cShareableModules == 0, ("%d\n", pGVM->gmm.s.Stats.cShareableModules));
5392 pGVM->gmm.s.Stats.cShareableModules = 0;
5393
5394 gmmR0MutexRelease(pGMM);
5395}
5396
5397#endif /* VBOX_WITH_PAGE_SHARING */
5398
5399/**
5400 * Removes all shared modules for the specified VM
5401 *
5402 * @returns VBox status code.
5403 * @param pGVM The global (ring-0) VM structure.
5404 * @param idCpu The VCPU id.
5405 */
5406GMMR0DECL(int) GMMR0ResetSharedModules(PGVM pGVM, VMCPUID idCpu)
5407{
5408#ifdef VBOX_WITH_PAGE_SHARING
5409 /*
5410 * Validate input and get the basics.
5411 */
5412 PGMM pGMM;
5413 GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
5414 int rc = GVMMR0ValidateGVMandEMT(pGVM, idCpu);
5415 if (RT_FAILURE(rc))
5416 return rc;
5417
5418 /*
5419 * Take the semaphore and do some more validations.
5420 */
5421 gmmR0MutexAcquire(pGMM);
5422 if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
5423 {
5424 Log(("GMMR0ResetSharedModules\n"));
5425 GMMR0SHMODPERVMDTORARGS Args;
5426 Args.pGVM = pGVM;
5427 Args.pGMM = pGMM;
5428 RTAvlGCPtrDestroy(&pGVM->gmm.s.pSharedModuleTree, gmmR0CleanupSharedModule, &Args);
5429 pGVM->gmm.s.Stats.cShareableModules = 0;
5430
5431 rc = VINF_SUCCESS;
5432 GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
5433 }
5434 else
5435 rc = VERR_GMM_IS_NOT_SANE;
5436
5437 gmmR0MutexRelease(pGMM);
5438 return rc;
5439#else
5440 RT_NOREF(pGVM, idCpu);
5441 return VERR_NOT_IMPLEMENTED;
5442#endif
5443}
5444
5445#ifdef VBOX_WITH_PAGE_SHARING
5446
5447/**
5448 * Tree enumeration callback for checking a shared module.
5449 */
5450static DECLCALLBACK(int) gmmR0CheckSharedModule(PAVLGCPTRNODECORE pNode, void *pvUser)
5451{
5452 GMMCHECKSHAREDMODULEINFO *pArgs = (GMMCHECKSHAREDMODULEINFO*)pvUser;
5453 PGMMSHAREDMODULEPERVM pRecVM = (PGMMSHAREDMODULEPERVM)pNode;
5454 PGMMSHAREDMODULE pGblMod = pRecVM->pGlobalModule;
5455
5456 Log(("gmmR0CheckSharedModule: check %s %s base=%RGv size=%x\n",
5457 pGblMod->szName, pGblMod->szVersion, pGblMod->Core.Key, pGblMod->cbModule));
5458
5459 int rc = PGMR0SharedModuleCheck(pArgs->pGVM, pArgs->pGVM, pArgs->idCpu, pGblMod, pRecVM->aRegionsGCPtrs);
5460 if (RT_FAILURE(rc))
5461 return rc;
5462 return VINF_SUCCESS;
5463}
5464
5465#endif /* VBOX_WITH_PAGE_SHARING */
5466
5467/**
5468 * Check all shared modules for the specified VM.
5469 *
5470 * @returns VBox status code.
5471 * @param pGVM The global (ring-0) VM structure.
5472 * @param idCpu The calling EMT number.
5473 * @thread EMT(idCpu)
5474 */
5475GMMR0DECL(int) GMMR0CheckSharedModules(PGVM pGVM, VMCPUID idCpu)
5476{
5477#ifdef VBOX_WITH_PAGE_SHARING
5478 /*
5479 * Validate input and get the basics.
5480 */
5481 PGMM pGMM;
5482 GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
5483 int rc = GVMMR0ValidateGVMandEMT(pGVM, idCpu);
5484 if (RT_FAILURE(rc))
5485 return rc;
5486
5487# ifndef DEBUG_sandervl
5488 /*
5489 * Take the semaphore and do some more validations.
5490 */
5491 gmmR0MutexAcquire(pGMM);
5492# endif
5493 if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
5494 {
5495 /*
5496 * Walk the tree, checking each module.
5497 */
5498 Log(("GMMR0CheckSharedModules\n"));
5499
5500 GMMCHECKSHAREDMODULEINFO Args;
5501 Args.pGVM = pGVM;
5502 Args.idCpu = idCpu;
5503 rc = RTAvlGCPtrDoWithAll(&pGVM->gmm.s.pSharedModuleTree, true /* fFromLeft */, gmmR0CheckSharedModule, &Args);
5504
5505 Log(("GMMR0CheckSharedModules done (rc=%Rrc)!\n", rc));
5506 GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
5507 }
5508 else
5509 rc = VERR_GMM_IS_NOT_SANE;
5510
5511# ifndef DEBUG_sandervl
5512 gmmR0MutexRelease(pGMM);
5513# endif
5514 return rc;
5515#else
5516 RT_NOREF(pGVM, idCpu);
5517 return VERR_NOT_IMPLEMENTED;
5518#endif
5519}
5520
5521#if defined(VBOX_STRICT) && HC_ARCH_BITS == 64
5522
5523/**
5524 * Worker for GMMR0FindDuplicatePageReq.
5525 *
5526 * @returns true if duplicate, false if not.
5527 */
5528static bool gmmR0FindDupPageInChunk(PGMM pGMM, PGVM pGVM, PGMMCHUNK pChunk, uint8_t const *pbSourcePage)
5529{
5530 bool fFoundDuplicate = false;
5531 /* Only take chunks not mapped into this VM process; not entirely correct. */
5532 uint8_t *pbChunk;
5533 if (!gmmR0IsChunkMapped(pGMM, pGVM, pChunk, (PRTR3PTR)&pbChunk))
5534 {
5535 int rc = gmmR0MapChunk(pGMM, pGVM, pChunk, false /*fRelaxedSem*/, (PRTR3PTR)&pbChunk);
5536 if (RT_SUCCESS(rc))
5537 {
5538 /*
5539 * Look for duplicate pages
5540 */
5541 uintptr_t iPage = (GMM_CHUNK_SIZE >> PAGE_SHIFT);
5542 while (iPage-- > 0)
5543 {
5544 if (GMM_PAGE_IS_PRIVATE(&pChunk->aPages[iPage]))
5545 {
5546 uint8_t *pbDestPage = pbChunk + (iPage << PAGE_SHIFT);
5547 if (!memcmp(pbSourcePage, pbDestPage, PAGE_SIZE))
5548 {
5549 fFoundDuplicate = true;
5550 break;
5551 }
5552 }
5553 }
5554 gmmR0UnmapChunk(pGMM, pGVM, pChunk, false /*fRelaxedSem*/);
5555 }
5556 }
5557 return fFoundDuplicate;
5558}
5559
5560
5561/**
5562 * Find a duplicate of the specified page in other active VMs
5563 *
5564 * @returns VBox status code.
5565 * @param pGVM The global (ring-0) VM structure.
5566 * @param pReq Pointer to the request packet.
5567 */
5568GMMR0DECL(int) GMMR0FindDuplicatePageReq(PGVM pGVM, PGMMFINDDUPLICATEPAGEREQ pReq)
5569{
5570 /*
5571 * Validate input and pass it on.
5572 */
5573 AssertPtrReturn(pReq, VERR_INVALID_POINTER);
5574 AssertMsgReturn(pReq->Hdr.cbReq == sizeof(*pReq), ("%#x != %#x\n", pReq->Hdr.cbReq, sizeof(*pReq)), VERR_INVALID_PARAMETER);
5575
5576 PGMM pGMM;
5577 GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
5578
5579 int rc = GVMMR0ValidateGVM(pGVM);
5580 if (RT_FAILURE(rc))
5581 return rc;
5582
5583 /*
5584 * Take the semaphore and do some more validations.
5585 */
5586 rc = gmmR0MutexAcquire(pGMM);
5587 if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
5588 {
5589 uint8_t *pbChunk;
5590 PGMMCHUNK pChunk = gmmR0GetChunk(pGMM, pReq->idPage >> GMM_CHUNKID_SHIFT);
5591 if (pChunk)
5592 {
5593 if (gmmR0IsChunkMapped(pGMM, pGVM, pChunk, (PRTR3PTR)&pbChunk))
5594 {
5595 uint8_t *pbSourcePage = pbChunk + ((pReq->idPage & GMM_PAGEID_IDX_MASK) << PAGE_SHIFT);
5596 PGMMPAGE pPage = gmmR0GetPage(pGMM, pReq->idPage);
5597 if (pPage)
5598 {
5599 /*
5600 * Walk the chunks
5601 */
5602 pReq->fDuplicate = false;
5603 RTListForEach(&pGMM->ChunkList, pChunk, GMMCHUNK, ListNode)
5604 {
5605 if (gmmR0FindDupPageInChunk(pGMM, pGVM, pChunk, pbSourcePage))
5606 {
5607 pReq->fDuplicate = true;
5608 break;
5609 }
5610 }
5611 }
5612 else
5613 {
5614 AssertFailed();
5615 rc = VERR_PGM_PHYS_INVALID_PAGE_ID;
5616 }
5617 }
5618 else
5619 AssertFailed();
5620 }
5621 else
5622 AssertFailed();
5623 }
5624 else
5625 rc = VERR_GMM_IS_NOT_SANE;
5626
5627 gmmR0MutexRelease(pGMM);
5628 return rc;
5629}
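
/*
 * Illustrative only: a strict-build debugging caller would fill the request
 * along these lines before handing it to ring-0 (the request header magic and
 * the actual dispatch are omitted; idPage is whatever page is being probed):
 *
 * @code
 *    GMMFINDDUPLICATEPAGEREQ Req;
 *    Req.Hdr.cbReq  = sizeof(Req);
 *    Req.idPage     = idPage;
 *    Req.fDuplicate = false;
 *    // After a successful GMMR0FindDuplicatePageReq call, Req.fDuplicate
 *    // tells whether an identical private page exists in another chunk.
 * @endcode
 */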
5630
5631#endif /* VBOX_STRICT && HC_ARCH_BITS == 64 */
5632
5633
5634/**
5635 * Retrieves the GMM statistics visible to the caller.
5636 *
5637 * @returns VBox status code.
5638 *
5639 * @param pStats Where to put the statistics.
5640 * @param pSession The current session.
5641 * @param pGVM The GVM to obtain statistics for. Optional.
5642 */
5643GMMR0DECL(int) GMMR0QueryStatistics(PGMMSTATS pStats, PSUPDRVSESSION pSession, PGVM pGVM)
5644{
5645    LogFlow(("GMMR0QueryStatistics: pStats=%p pSession=%p pGVM=%p\n", pStats, pSession, pGVM));
5646
5647 /*
5648 * Validate input.
5649 */
5650 AssertPtrReturn(pSession, VERR_INVALID_POINTER);
5651 AssertPtrReturn(pStats, VERR_INVALID_POINTER);
5652 pStats->cMaxPages = 0; /* (crash before taking the mutex...) */
5653
5654 PGMM pGMM;
5655 GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
5656
5657 /*
5658 * Validate the VM handle, if not NULL, and lock the GMM.
5659 */
5660 int rc;
5661 if (pGVM)
5662 {
5663 rc = GVMMR0ValidateGVM(pGVM);
5664 if (RT_FAILURE(rc))
5665 return rc;
5666 }
5667
5668 rc = gmmR0MutexAcquire(pGMM);
5669 if (RT_FAILURE(rc))
5670 return rc;
5671
5672 /*
5673 * Copy out the GMM statistics.
5674 */
5675 pStats->cMaxPages = pGMM->cMaxPages;
5676 pStats->cReservedPages = pGMM->cReservedPages;
5677 pStats->cOverCommittedPages = pGMM->cOverCommittedPages;
5678 pStats->cAllocatedPages = pGMM->cAllocatedPages;
5679 pStats->cSharedPages = pGMM->cSharedPages;
5680 pStats->cDuplicatePages = pGMM->cDuplicatePages;
5681 pStats->cLeftBehindSharedPages = pGMM->cLeftBehindSharedPages;
5682 pStats->cBalloonedPages = pGMM->cBalloonedPages;
5683 pStats->cChunks = pGMM->cChunks;
5684 pStats->cFreedChunks = pGMM->cFreedChunks;
5685 pStats->cShareableModules = pGMM->cShareableModules;
5686 pStats->idFreeGeneration = pGMM->idFreeGeneration;
5687 RT_ZERO(pStats->au64Reserved);
5688
5689 /*
5690 * Copy out the VM statistics.
5691 */
5692 if (pGVM)
5693 pStats->VMStats = pGVM->gmm.s.Stats;
5694 else
5695 RT_ZERO(pStats->VMStats);
5696
5697 gmmR0MutexRelease(pGMM);
5698 return rc;
5699}
5700
5701
5702/**
5703 * VMMR0 request wrapper for GMMR0QueryStatistics.
5704 *
5705 * @returns see GMMR0QueryStatistics.
5706 * @param pGVM The global (ring-0) VM structure. Optional.
5707 * @param pReq Pointer to the request packet.
5708 */
5709GMMR0DECL(int) GMMR0QueryStatisticsReq(PGVM pGVM, PGMMQUERYSTATISTICSSREQ pReq)
5710{
5711 /*
5712 * Validate input and pass it on.
5713 */
5714 AssertPtrReturn(pReq, VERR_INVALID_POINTER);
5715 AssertMsgReturn(pReq->Hdr.cbReq == sizeof(*pReq), ("%#x != %#x\n", pReq->Hdr.cbReq, sizeof(*pReq)), VERR_INVALID_PARAMETER);
5716
5717 return GMMR0QueryStatistics(&pReq->Stats, pReq->pSession, pGVM);
5718}
5719
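/*
 * A minimal usage sketch (illustrative; assumes a valid session and, when
 * per-VM numbers are wanted, a valid GVM handle):
 *
 * @code
 *    GMMSTATS Stats;
 *    int rc = GMMR0QueryStatistics(&Stats, pSession, pGVM);  // pGVM may be NULL for global stats only
 *    if (RT_SUCCESS(rc))
 *    {
 *        // Global numbers: Stats.cMaxPages, Stats.cAllocatedPages,
 *        // Stats.cSharedPages, Stats.cBalloonedPages, Stats.cChunks, ...
 *        // Per-VM numbers (when pGVM was given): Stats.VMStats.
 *    }
 * @endcode
 */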
5720
5721/**
5722 * Resets the specified GMM statistics.
5723 *
5724 * @returns VBox status code.
5725 *
5726 * @param pStats Which statistics to reset, that is, non-zero fields
5727 * indicates which to reset.
5728 * @param pSession The current session.
5729 * @param pGVM The GVM to reset statistics for. Optional.
5730 */
5731GMMR0DECL(int) GMMR0ResetStatistics(PCGMMSTATS pStats, PSUPDRVSESSION pSession, PGVM pGVM)
5732{
5733 NOREF(pStats); NOREF(pSession); NOREF(pGVM);
5734    /* Currently there is nothing to reset. */
5735 return VINF_SUCCESS;
5736}
5737
5738
5739/**
5740 * VMMR0 request wrapper for GMMR0ResetStatistics.
5741 *
5742 * @returns see GMMR0ResetStatistics.
5743 * @param pGVM The global (ring-0) VM structure. Optional.
5744 * @param pReq Pointer to the request packet.
5745 */
5746GMMR0DECL(int) GMMR0ResetStatisticsReq(PGVM pGVM, PGMMRESETSTATISTICSSREQ pReq)
5747{
5748 /*
5749 * Validate input and pass it on.
5750 */
5751 AssertPtrReturn(pReq, VERR_INVALID_POINTER);
5752 AssertMsgReturn(pReq->Hdr.cbReq == sizeof(*pReq), ("%#x != %#x\n", pReq->Hdr.cbReq, sizeof(*pReq)), VERR_INVALID_PARAMETER);
5753
5754 return GMMR0ResetStatistics(&pReq->Stats, pReq->pSession, pGVM);
5755}
5756