Appendix E Boot Memory Allocator
E.1 Initialising the Boot Memory Allocator
The functions in this section are responsible for bootstrapping the
boot memory allocator. It starts with the architecture specific function
setup_memory() (See Section B.1.1) but all architectures
cover the same basic tasks in the architecture specific function before
calling the architecture-independent function init_bootmem().
E.1.1 Function: init_bootmem
Source: mm/bootmem.c
This is called by UMA architectures to initialise their boot memory allocator
structures.
304 unsigned long __init init_bootmem (unsigned long start,
unsigned long pages)
305 {
306 max_low_pfn = pages;
307 min_low_pfn = start;
308 return(init_bootmem_core(&contig_page_data, start, 0, pages));
309 }
- 304Confusingly, the pages parameter is actually the end
PFN of the memory addressable by this node, not the number of pages as the
name implies
- 306Set the max PFN addressable by this node in case the architecture
dependent code did not
- 307Set the min PFN addressable by this node in case the architecture
dependent code did not
- 308Call init_bootmem_core()(See Section E.1.3)
which does the real work of initialising the bootmem_data
E.1.2 Function: init_bootmem_node
Source: mm/bootmem.c
This is called by NUMA architectures to initialise boot memory allocator
data for a given node.
284 unsigned long __init init_bootmem_node (pg_data_t *pgdat,
unsigned long freepfn,
unsigned long startpfn,
unsigned long endpfn)
285 {
286 return(init_bootmem_core(pgdat, freepfn, startpfn, endpfn));
287 }
- 286Just call
init_bootmem_core()(See Section E.1.3) directly
E.1.3 Function: init_bootmem_core
Source: mm/bootmem.c
Initialises the appropriate struct bootmem_data_t and inserts
the node into the linked list of nodes pgdat_list.
46 static unsigned long __init init_bootmem_core (pg_data_t *pgdat,
47 unsigned long mapstart, unsigned long start, unsigned long end)
48 {
49 bootmem_data_t *bdata = pgdat->bdata;
50 unsigned long mapsize = ((end - start)+7)/8;
51
52 pgdat->node_next = pgdat_list;
53 pgdat_list = pgdat;
54
55 mapsize = (mapsize + (sizeof(long) - 1UL)) &
~(sizeof(long) - 1UL);
56 bdata->node_bootmem_map = phys_to_virt(mapstart << PAGE_SHIFT);
57 bdata->node_boot_start = (start << PAGE_SHIFT);
58 bdata->node_low_pfn = end;
59
60 /*
61 * Initially all pages are reserved - setup_arch() has to
62 * register free RAM areas explicitly.
63 */
64 memset(bdata->node_bootmem_map, 0xff, mapsize);
65
66 return mapsize;
67 }
- 46The parameters are:
- pgdat is the node descriptor being initialised
- mapstart is the beginning of the memory that will be usable
- start is the beginning PFN of the node
- end is the end PFN of the node
- 50Each page requires one bit to represent it so the size of the
map required is the number of pages in this node rounded up to the nearest
multiple of 8 and then divided by 8 to give the number of bytes required
- 52-53As the node will shortly be considered initialised, insert it into
the global pgdat_list
- 55Round the mapsize up to the closest word boundary
- 56Convert the mapstart to a virtual address and store it
in bdata→node_bootmem_map
- 57Convert the starting PFN to a physical address and store it on
node_boot_start
- 58Store the end PFN of ZONE_NORMAL in node_low_pfn
- 64Fill the full map with 1's marking all pages as allocated. It is up
to the architecture dependent code to mark the usable pages
E.2 Allocating Memory
E.2.1 Reserving Large Regions of Memory
E.2.1.1 Function: reserve_bootmem
Source: mm/bootmem.c
311 void __init reserve_bootmem (unsigned long addr, unsigned long size)
312 {
313 reserve_bootmem_core(contig_page_data.bdata, addr, size);
314 }
- 313Just call
reserve_bootmem_core()(See Section E.2.1.3). As this
is for a non-NUMA architecture, the node to allocate from is the static
contig_page_data node.
E.2.1.2 Function: reserve_bootmem_node
Source: mm/bootmem.c
289 void __init reserve_bootmem_node (pg_data_t *pgdat,
unsigned long physaddr,
unsigned long size)
290 {
291 reserve_bootmem_core(pgdat->bdata, physaddr, size);
292 }
- 291Just call
reserve_bootmem_core()(See Section E.2.1.3) passing
it the bootmem data of the requested node
E.2.1.3 Function: reserve_bootmem_core
Source: mm/bootmem.c
74 static void __init reserve_bootmem_core(bootmem_data_t *bdata,
unsigned long addr,
unsigned long size)
75 {
76 unsigned long i;
77 /*
78 * round up, partially reserved pages are considered
79 * fully reserved.
80 */
81 unsigned long sidx = (addr - bdata->node_boot_start)/PAGE_SIZE;
82 unsigned long eidx = (addr + size - bdata->node_boot_start +
83 PAGE_SIZE-1)/PAGE_SIZE;
84 unsigned long end = (addr + size + PAGE_SIZE-1)/PAGE_SIZE;
85
86 if (!size) BUG();
87
88 if (sidx < 0)
89 BUG();
90 if (eidx < 0)
91 BUG();
92 if (sidx >= eidx)
93 BUG();
94 if ((addr >> PAGE_SHIFT) >= bdata->node_low_pfn)
95 BUG();
96 if (end > bdata->node_low_pfn)
97 BUG();
98 for (i = sidx; i < eidx; i++)
99 if (test_and_set_bit(i, bdata->node_bootmem_map))
100 printk("hm, page %08lx reserved twice.\n",
i*PAGE_SIZE);
101 }
- 81The sidx is the starting index of pages to reserve. The
value is obtained by subtracting the node starting address from the requested
address and dividing by the size of a page
- 82A similar calculation is made for the ending index eidx
except that the allocation is rounded up to the nearest page. This means that
requests to partially reserve a page will result in the full page being
reserved
- 84end is the last PFN that is affected by this reservation
- 86Check that a non-zero value has been given
- 88-89Check the starting index is not before the start of the node
- 90-91Check the end index is not before the start of the node
- 92-93Check the starting index is not after the end index
- 94-95Check the starting address is not beyond the memory this bootmem
node represents
- 96-97Check the ending address is not beyond the memory this bootmem
node represents
- 98-100Starting with sidx and finishing with eidx,
test and set the bit in the bootmem map that represents the page marking it as
allocated. If the bit was already set to 1, print out a message saying it was
reserved twice
E.2.2 Allocating Memory at Boot Time
E.2.2.1 Function: alloc_bootmem
Source: mm/bootmem.c
The callgraph for these macros is shown in Figure 5.1.
38 #define alloc_bootmem(x) \
39 __alloc_bootmem((x), SMP_CACHE_BYTES, __pa(MAX_DMA_ADDRESS))
40 #define alloc_bootmem_low(x) \
41 __alloc_bootmem((x), SMP_CACHE_BYTES, 0)
42 #define alloc_bootmem_pages(x) \
43 __alloc_bootmem((x), PAGE_SIZE, __pa(MAX_DMA_ADDRESS))
44 #define alloc_bootmem_low_pages(x) \
45 __alloc_bootmem((x), PAGE_SIZE, 0)
- 39alloc_bootmem() will align to the L1 hardware cache and
start searching for a page after the maximum address usable for DMA
- 40alloc_bootmem_low() will align to the L1 hardware cache
and start searching from page 0
- 42alloc_bootmem_pages() will align the allocation to a page
size so that full pages will be allocated starting from the maximum address
usable for DMA
- 44alloc_bootmem_low_pages() will align the allocation to a page
size so that full pages will be allocated starting from physical address 0
E.2.2.2 Function: __alloc_bootmem
Source: mm/bootmem.c
326 void * __init __alloc_bootmem (unsigned long size,
unsigned long align, unsigned long goal)
327 {
328 pg_data_t *pgdat;
329 void *ptr;
330
331 for_each_pgdat(pgdat)
332 if ((ptr = __alloc_bootmem_core(pgdat->bdata, size,
333 align, goal)))
334 return(ptr);
335
336 /*
337 * Whoops, we cannot satisfy the allocation request.
338 */
339 printk(KERN_ALERT "bootmem alloc of %lu bytes failed!\n", size);
340 panic("Out of memory");
341 return NULL;
342 }
- 326The parameters are:
- size is the size of the requested allocation
- align is the desired alignment and must be a power of 2. Currently
either SMP_CACHE_BYTES or PAGE_SIZE
- goal is the starting address to begin searching from
- 331-334Cycle through all available nodes and try allocating from each
in turn. In the UMA case, this will just allocate from the
contig_page_data node
- 339-340If the allocation fails, the system is not going to be able to
boot so the kernel panics
E.2.2.3 Function: alloc_bootmem_node
Source: mm/bootmem.c
53 #define alloc_bootmem_node(pgdat, x) \
54 __alloc_bootmem_node((pgdat), (x), SMP_CACHE_BYTES,
__pa(MAX_DMA_ADDRESS))
55 #define alloc_bootmem_pages_node(pgdat, x) \
56 __alloc_bootmem_node((pgdat), (x), PAGE_SIZE,
__pa(MAX_DMA_ADDRESS))
57 #define alloc_bootmem_low_pages_node(pgdat, x) \
58 __alloc_bootmem_node((pgdat), (x), PAGE_SIZE, 0)
- 53-54alloc_bootmem_node() will allocate from the requested
node and align to the L1 hardware cache and start searching for a page
beginning with ZONE_NORMAL (i.e. at the end of ZONE_DMA which is
at MAX_DMA_ADDRESS)
- 55-56alloc_bootmem_pages_node() will allocate from the
requested node and align the allocation to a page size so that full pages
will be allocated starting from ZONE_NORMAL
- 57-58alloc_bootmem_low_pages_node() will allocate from the
requested node and align the allocation to a page size so that full pages
will be allocated starting from physical address 0 so that ZONE_DMA will
be used
E.2.2.4 Function: __alloc_bootmem_node
Source: mm/bootmem.c
344 void * __init __alloc_bootmem_node (pg_data_t *pgdat,
unsigned long size,
unsigned long align,
unsigned long goal)
345 {
346 void *ptr;
347
348 ptr = __alloc_bootmem_core(pgdat->bdata, size, align, goal);
349 if (ptr)
350 return (ptr);
351
352 /*
353 * Whoops, we cannot satisfy the allocation request.
354 */
355 printk(KERN_ALERT "bootmem alloc of %lu bytes failed!\n", size);
356 panic("Out of memory");
357 return NULL;
358 }
- 344The parameters are the same as for
__alloc_bootmem() (See Section E.2.2.2)
except the node to allocate from is specified
- 348Call the core function __alloc_bootmem_core()
(See Section E.2.2.5) to perform the allocation
- 349-350Return a pointer if it was successful
- 355-356Otherwise print out a message and panic the kernel as the system
will not boot if memory can not be allocated even now
E.2.2.5 Function: __alloc_bootmem_core
Source: mm/bootmem.c
This is the core function for allocating memory from a specified node with the
boot memory allocator. It is quite large and broken up into the following
tasks:
- Function preamble. Make sure the parameters are sane
- Calculate the starting address to scan from based on the goal
parameter
- Check to see if this allocation may be merged with the page used for the
previous allocation to save memory.
- Mark the pages allocated as 1 in the bitmap and zero out the contents of
the pages
144 static void * __init __alloc_bootmem_core (bootmem_data_t *bdata,
145 unsigned long size, unsigned long align, unsigned long goal)
146 {
147 unsigned long i, start = 0;
148 void *ret;
149 unsigned long offset, remaining_size;
150 unsigned long areasize, preferred, incr;
151 unsigned long eidx = bdata->node_low_pfn -
152 (bdata->node_boot_start >> PAGE_SHIFT);
153
154 if (!size) BUG();
155
156 if (align & (align-1))
157 BUG();
158
159 offset = 0;
160 if (align &&
161 (bdata->node_boot_start & (align - 1UL)) != 0)
162 offset = (align - (bdata->node_boot_start &
(align - 1UL)));
163 offset >>= PAGE_SHIFT;
Function preamble, make sure the parameters are sane
- 144The parameters are:
- bdata is the bootmem data struct being allocated from
- size is the size of the requested allocation
- align is the desired alignment for the allocation. Must be a power of 2
- goal is the preferred address to allocate above if possible
- 151Calculate the ending bit index eidx which is the
highest page index that may be used for the allocation
- 154Call BUG() if a request size of 0 is specified
- 156-157If the alignment is not a power of 2, call BUG()
- 159The default offset for alignments is 0
- 160-161If an alignment has been specified and the start of the node
is not already aligned to it, calculate the offset to use
- 162The offset to use is the number of bytes needed to round
node_boot_start up to the next align boundary. Line 163
converts this from bytes to pages, so it will only be non-zero for
alignments larger than a page
169 if (goal && (goal >= bdata->node_boot_start) &&
170 ((goal >> PAGE_SHIFT) < bdata->node_low_pfn)) {
171 preferred = goal - bdata->node_boot_start;
172 } else
173 preferred = 0;
174
175 preferred = ((preferred + align - 1) & ~(align - 1))
>> PAGE_SHIFT;
176 preferred += offset;
177 areasize = (size+PAGE_SIZE-1)/PAGE_SIZE;
178 incr = align >> PAGE_SHIFT ? : 1;
Calculate the starting PFN to start scanning from based on the goal
parameter.
- 169If a goal has been specified and the goal is after the starting
address for this node and the PFN of the goal is less than the last PFN
addressable by this node then ....
- 170The preferred offset to start from is the goal minus the beginning
of the memory addressable by this node
- 173Else the preferred offset is 0
- 175-176Adjust the preferred address to take the offset into account
so that the address will be correctly aligned
- 177The number of pages that will be affected by this allocation is
stored in areasize
- 178incr is the number of pages that have to be skipped to
satisfy alignment requirements if they are over one page
179
180 restart_scan:
181 for (i = preferred; i < eidx; i += incr) {
182 unsigned long j;
183 if (test_bit(i, bdata->node_bootmem_map))
184 continue;
185 for (j = i + 1; j < i + areasize; ++j) {
186 if (j >= eidx)
187 goto fail_block;
188 if (test_bit (j, bdata->node_bootmem_map))
189 goto fail_block;
190 }
191 start = i;
192 goto found;
193 fail_block:;
194 }
195 if (preferred) {
196 preferred = offset;
197 goto restart_scan;
198 }
199 return NULL;
Scan through memory looking for a block large enough to satisfy this request
- 180If the allocation could not be satisfied starting from
goal, this label is jumped to so that the map will be rescanned
- 181-194Starting from preferred, scan linearly searching
for a free block large enough to satisfy the request. Walk the address space
in incr steps to satisfy alignments greater than one page. If the
alignment is less than a page, incr will just be 1
- 183-184Test the bit, if it is already 1, it is not free so move to the
next page
- 185-190Scan the next areasize number of pages and see if
they are also free. It fails if the end of the addressable space is reached
(eidx) or one of the pages is already in use
- 191-192A free block is found so record the start and jump to
the found block
- 195-198The allocation failed so start again from the beginning
- 199If that also failed, return NULL which will result in a kernel
panic
200 found:
201 if (start >= eidx)
202 BUG();
203
209 if (align <= PAGE_SIZE
210 && bdata->last_offset && bdata->last_pos+1 == start) {
211 offset = (bdata->last_offset+align-1) & ~(align-1);
212 if (offset > PAGE_SIZE)
213 BUG();
214 remaining_size = PAGE_SIZE-offset;
215 if (size < remaining_size) {
216 areasize = 0;
217 // last_pos unchanged
218 bdata->last_offset = offset+size;
219 ret = phys_to_virt(bdata->last_pos*PAGE_SIZE + offset +
220 bdata->node_boot_start);
221 } else {
222 remaining_size = size - remaining_size;
223 areasize = (remaining_size+PAGE_SIZE-1)/PAGE_SIZE;
224 ret = phys_to_virt(bdata->last_pos*PAGE_SIZE +
225 offset +
bdata->node_boot_start);
226 bdata->last_pos = start+areasize-1;
227 bdata->last_offset = remaining_size;
228 }
229 bdata->last_offset &= ~PAGE_MASK;
230 } else {
231 bdata->last_pos = start + areasize - 1;
232 bdata->last_offset = size & ~PAGE_MASK;
233 ret = phys_to_virt(start * PAGE_SIZE +
bdata->node_boot_start);
234 }
Test to see if this allocation may be merged with the previous allocation.
- 201-202Check that the start of the allocation is not after the
addressable memory. This check was just made so it is redundant
- 209-230Try to merge with the previous allocation if the alignment
is less than a PAGE_SIZE, the previously used page has space in it
(last_offset != 0) and the previously used page is adjacent
to the page found for this allocation
- 231-234Else record the pages and offset used for this allocation to be
used for merging with the next allocation
- 211Update the offset to use to be aligned correctly for the requested
align
- 212-213If the offset now goes over the edge of a page,
BUG() is called. This condition would require a very poor choice
of alignment to be used. As the only alignment commonly used is a factor of
PAGE_SIZE, it is impossible for normal usage
- 214remaining_size is the remaining free space in the
previously used page
- 215-221If there is enough space left in the old page then use the old
page totally and update the bootmem_data struct to reflect it
- 221-228Else calculate how many pages in addition to this one will be
required and update the bootmem_data
- 216The number of pages used by this allocation is now 0
- 218Update the last_offset to be the end of this allocation
- 219Calculate the virtual address to return for the successful
allocation
- 222remaining_size is how much space will be used in the last page
used to satisfy the allocation
- 223Calculate how many more pages are needed to satisfy the allocation
- 224Record the address the allocation starts from
- 226The last page used is the start page plus the number of
additional pages required to satisfy this allocation areasize
- 227The end of the allocation has already been calculated
- 229If the offset is at the end of the page, make it 0
- 231No merging took place so record the last page used to satisfy this
allocation
- 232Record how much of the last page was used
- 233Record the starting virtual address of the allocation
238 for (i = start; i < start+areasize; i++)
239 if (test_and_set_bit(i, bdata->node_bootmem_map))
240 BUG();
241 memset(ret, 0, size);
242 return ret;
243 }
Mark the pages allocated as 1 in the bitmap and zero out the contents of
the pages
- 238-240Cycle through all pages used for this allocation and set the bit
to 1 in the bitmap. If any of them are already 1, then a double allocation took
place so call BUG()
- 241Zero fill the pages
- 242Return the address of the allocation
E.3 Freeing Memory
E.3.1 Function: free_bootmem
Source: mm/bootmem.c
Figure E.1: Call Graph: free_bootmem()
294 void __init free_bootmem_node (pg_data_t *pgdat,
unsigned long physaddr, unsigned long size)
295 {
296 return(free_bootmem_core(pgdat->bdata, physaddr, size));
297 }
316 void __init free_bootmem (unsigned long addr, unsigned long size)
317 {
318 return(free_bootmem_core(contig_page_data.bdata, addr, size));
319 }
- 296Call the core function with the corresponding bootmem data for the
requested node
- 318Call the core function with the bootmem data for
contig_page_data
E.3.2 Function: free_bootmem_core
Source: mm/bootmem.c
103 static void __init free_bootmem_core(bootmem_data_t *bdata,
unsigned long addr,
unsigned long size)
104 {
105 unsigned long i;
106 unsigned long start;
111 unsigned long sidx;
112 unsigned long eidx = (addr + size -
bdata->node_boot_start)/PAGE_SIZE;
113 unsigned long end = (addr + size)/PAGE_SIZE;
114
115 if (!size) BUG();
116 if (end > bdata->node_low_pfn)
117 BUG();
118
119 /*
120 * Round up the beginning of the address.
121 */
122 start = (addr + PAGE_SIZE-1) / PAGE_SIZE;
123 sidx = start - (bdata->node_boot_start/PAGE_SIZE);
124
125 for (i = sidx; i < eidx; i++) {
126 if (!test_and_clear_bit(i, bdata->node_bootmem_map))
127 BUG();
128 }
129 }
- 112Calculate the end index affected as eidx
- 113The end address is the end of the affected area rounded down to the
nearest page if it is not already page aligned
- 115If a size of 0 is freed, call BUG
- 116-117If the end PFN is after the memory addressable by this node,
call BUG
- 122Round the starting address up to the nearest page if it is not
already page aligned
- 123Calculate the starting index to free
- 125-127For all full pages that are freed by this action, clear the bit
in the boot bitmap. If it is already 0, it is a double free or is memory that
was never used so call BUG
E.4 Retiring the Boot Memory Allocator
Once the system is started, the boot memory allocator is no longer needed
so these functions are responsible for removing unnecessary boot memory
allocator structures and passing the remaining pages to the normal physical
page allocator.
E.4.1 Function: mem_init
Source: arch/i386/mm/init.c
The call graph for this function is shown in Figure 5.2. The
important part of this function for the boot memory allocator is that it calls
free_pages_init()(See Section E.4.2). The function is
broken up into the following tasks
- Function preamble, set the PFN within the global mem_map
for the location of high memory and zero out the system wide zero page
- Call free_pages_init()(See Section E.4.2)
- Print out an informational message on the availability of memory in the
system
- Check the CPU supports PAE if the config option is enabled and test
the WP bit on the CPU. This is important as without the WP bit, the
function verify_write() has to be called for every write to
userspace from the kernel. This only applies to old processors like
the 386
- Fill in entries for the userspace portion of the PGD for
swapper_pg_dir, the kernel page tables. The zero page is
mapped for all entries
507 void __init mem_init(void)
508 {
509 int codesize, reservedpages, datasize, initsize;
510
511 if (!mem_map)
512 BUG();
513
514 set_max_mapnr_init();
515
516 high_memory = (void *) __va(max_low_pfn * PAGE_SIZE);
517
518 /* clear the zero-page */
519 memset(empty_zero_page, 0, PAGE_SIZE);
- 514This function records the PFN high memory starts in
mem_map (highmem_start_page), the maximum number of
pages in the system (max_mapnr and num_physpages)
and finally the maximum number of pages that may be mapped by the kernel
(num_mappedpages)
- 516high_memory is the virtual address where high memory
begins
- 519Zero out the system wide zero page
520
521 reservedpages = free_pages_init();
522
- 521Call free_pages_init()(See Section E.4.2)
which tells the boot memory allocator to retire itself as well as initialising
all pages in high memory for use with the buddy allocator
523 codesize = (unsigned long) &_etext - (unsigned long) &_text;
524 datasize = (unsigned long) &_edata - (unsigned long) &_etext;
525 initsize = (unsigned long) &__init_end - (unsigned long)
&__init_begin;
526
527 printk(KERN_INFO "Memory: %luk/%luk available (%dk kernel code,
%dk reserved, %dk data, %dk init, %ldk highmem)\n",
528 (unsigned long) nr_free_pages() << (PAGE_SHIFT-10),
529 max_mapnr << (PAGE_SHIFT-10),
530 codesize >> 10,
531 reservedpages << (PAGE_SHIFT-10),
532 datasize >> 10,
533 initsize >> 10,
534 (unsigned long) (totalhigh_pages << (PAGE_SHIFT-10))
535 );
Print out an informational message
- 523Calculate the size of the code segment, data segment and memory
used by initialisation code and data (all functions marked __init
will be in this section)
- 527-535Print out a nice message on the availability of memory and
the amount of memory consumed by the kernel
536
537 #if CONFIG_X86_PAE
538 if (!cpu_has_pae)
539 panic("cannot execute a PAE-enabled kernel on a PAE-less
CPU!");
540 #endif
541 if (boot_cpu_data.wp_works_ok < 0)
542 test_wp_bit();
543
- 538-539If PAE is enabled but the processor does not support it, panic
- 541-542Test for the availability of the WP bit
550 #ifndef CONFIG_SMP
551 zap_low_mappings();
552 #endif
553
554 }
- 551Cycle through each PGD used by the userspace portion of
swapper_pg_dir and map the zero page to it
E.4.2 Function: free_pages_init
Source: arch/i386/mm/init.c
This function has two important tasks: to call
free_all_bootmem() (See Section E.4.4) to retire the
boot memory allocator and to free all high memory pages to the buddy allocator.
481 static int __init free_pages_init(void)
482 {
483 extern int ppro_with_ram_bug(void);
484 int bad_ppro, reservedpages, pfn;
485
486 bad_ppro = ppro_with_ram_bug();
487
488 /* this will put all low memory onto the freelists */
489 totalram_pages += free_all_bootmem();
490
491 reservedpages = 0;
492 for (pfn = 0; pfn < max_low_pfn; pfn++) {
493 /*
494 * Only count reserved RAM pages
495 */
496 if (page_is_ram(pfn) && PageReserved(mem_map+pfn))
497 reservedpages++;
498 }
499 #ifdef CONFIG_HIGHMEM
500 for (pfn = highend_pfn-1; pfn >= highstart_pfn; pfn--)
501 one_highpage_init((struct page *) (mem_map + pfn), pfn,
bad_ppro);
502 totalram_pages += totalhigh_pages;
503 #endif
504 return reservedpages;
505 }
- 486There is a bug in Pentium Pro processors that prevents certain pages
in high memory being used. The function ppro_with_ram_bug()
checks for its existence
- 489Call free_all_bootmem() to retire the boot memory
allocator
- 491-498Cycle through all of memory and count the number of reserved
pages that were left over by the boot memory allocator
- 500-501For each page in high memory, call
one_highpage_init() (See Section E.4.3). This
function clears the PG_reserved bit, sets the PG_high bit,
sets the count to 1, calls __free_pages() to give the page to the
buddy allocator and increments the totalhigh_pages count. Pages
which kill buggy Pentium Pros are skipped
E.4.3 Function: one_highpage_init
Source: arch/i386/mm/init.c
This function initialises the information for one page in high memory
and checks to make sure that the page will not trigger a bug with some
Pentium Pros. It only exists if CONFIG_HIGHMEM is specified
at compile time.
449 #ifdef CONFIG_HIGHMEM
450 void __init one_highpage_init(struct page *page, int pfn,
int bad_ppro)
451 {
452 if (!page_is_ram(pfn)) {
453 SetPageReserved(page);
454 return;
455 }
456
457 if (bad_ppro && page_kills_ppro(pfn)) {
458 SetPageReserved(page);
459 return;
460 }
461
462 ClearPageReserved(page);
463 set_bit(PG_highmem, &page->flags);
464 atomic_set(&page->count, 1);
465 __free_page(page);
466 totalhigh_pages++;
467 }
468 #endif /* CONFIG_HIGHMEM */
- 452-455If a page does not exist at the PFN, then mark the
struct page as reserved so it will not be used
- 457-460If the running CPU is susceptible to the Pentium Pro bug and
this page is a page that would cause a crash (page_kills_ppro()
performs the check), then mark the page as reserved so it will never be
allocated
- 462From here on, the page is a high memory page that should be used so
first clear the reserved bit so it will be given to the buddy allocator later
- 463Set the PG_highmem bit to show it is a high memory
page
- 464Initialise the usage count of the page to 1 which will be set to 0
by the buddy allocator
- 465Free the page with
__free_page()(See Section F.4.2) so that the buddy allocator
will add the high memory page to its free lists
- 466Increment the total number of available high memory pages
(totalhigh_pages)
E.4.4 Function: free_all_bootmem
Source: mm/bootmem.c
299 unsigned long __init free_all_bootmem_node (pg_data_t *pgdat)
300 {
301 return(free_all_bootmem_core(pgdat));
302 }
321 unsigned long __init free_all_bootmem (void)
322 {
323 return(free_all_bootmem_core(&contig_page_data));
324 }
- 299-302For NUMA, simply call the core function with the specified
pgdat
- 321-324For UMA, call the core function with the only node
contig_page_data
E.4.5 Function: free_all_bootmem_core
Source: mm/bootmem.c
This is the core function which “retires” the boot memory allocator. It is
divided into two major tasks:
- For all unallocated pages known to the allocator for this node:
  - Clear the PG_reserved flag in its struct page
  - Set the count to 1
  - Call __free_pages() so that the buddy allocator can build
its free lists
- Free all pages used for the bitmap and give them to the buddy
allocator
245 static unsigned long __init free_all_bootmem_core(pg_data_t *pgdat)
246 {
247 struct page *page = pgdat->node_mem_map;
248 bootmem_data_t *bdata = pgdat->bdata;
249 unsigned long i, count, total = 0;
250 unsigned long idx;
251
252 if (!bdata->node_bootmem_map) BUG();
253
254 count = 0;
255 idx = bdata->node_low_pfn -
(bdata->node_boot_start >> PAGE_SHIFT);
256 for (i = 0; i < idx; i++, page++) {
257 if (!test_bit(i, bdata->node_bootmem_map)) {
258 count++;
259 ClearPageReserved(page);
260 set_page_count(page, 1);
261 __free_page(page);
262 }
263 }
264 total += count;
- 252If no map is available, it means that this node has already been
freed and something woeful is wrong with the architecture dependent code so
call BUG()
- 254A running count of the number of pages given to the buddy allocator
- 255idx is the last index that is addressable by this node
- 256-263Cycle through all pages addressable by this node
- 257If the page is marked free then...
- 258Increase the running count of pages given to the buddy allocator
- 259Clear the PG_reserved flag
- 260Set the count to 1 so that the buddy allocator will think this is
the last user of the page and place it in its free lists
- 261Call the buddy allocator free function so the page will be added to
it's free lists
- 264total will give the total number of pages given over by
this function
270 page = virt_to_page(bdata->node_bootmem_map);
271 count = 0;
272 for (i = 0;
i < ((bdata->node_low_pfn - (bdata->node_boot_start >> PAGE_SHIFT)
)/8 + PAGE_SIZE-1)/PAGE_SIZE;
i++,page++) {
273 count++;
274 ClearPageReserved(page);
275 set_page_count(page, 1);
276 __free_page(page);
277 }
278 total += count;
279 bdata->node_bootmem_map = NULL;
280
281 return total;
282 }
Free the allocator bitmap and return
- 270Get the struct page that is at the beginning of the
bootmem map
- 271Count of pages freed by the bitmap
- 272-277For all pages used by the bitmap, free them to the buddy
allocator the same way the previous block of code did
- 279Set the bootmem map to NULL to prevent it being freed a second time
by accident
- 281Return the total number of pages freed by this function, or in other
words, return the number of pages that were added to the buddy allocator's free
lists