From 14f7ed057ed6359f6836ea87fa413ad40b9d51c8 Mon Sep 17 00:00:00 2001 From: Stephan Raue Date: Sun, 2 May 2010 18:47:20 +0200 Subject: [PATCH] linux: - remove tuxonice patch --- .../patches/tuxonice-3.1-for-2.6.33.diff | 21427 ---------------- 1 file changed, 21427 deletions(-) delete mode 100644 packages/linux/patches/tuxonice-3.1-for-2.6.33.diff diff --git a/packages/linux/patches/tuxonice-3.1-for-2.6.33.diff b/packages/linux/patches/tuxonice-3.1-for-2.6.33.diff deleted file mode 100644 index fdd7bee0d9..0000000000 --- a/packages/linux/patches/tuxonice-3.1-for-2.6.33.diff +++ /dev/null @@ -1,21427 +0,0 @@ -diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt -index e7848a0..616afc2 100644 ---- a/Documentation/kernel-parameters.txt -+++ b/Documentation/kernel-parameters.txt -@@ -2703,6 +2703,9 @@ and is between 256 and 4096 characters. It is defined in the file - medium is write-protected). - Example: quirks=0419:aaf5:rl,0421:0433:rc - -+ uuid_debug= (Boolean) whether to enable debugging of TuxOnIce's -+ uuid support. -+ - vdso= [X86,SH] - vdso=2: enable compat VDSO (default with COMPAT_VDSO) - vdso=1: enable VDSO (default) -diff --git a/Documentation/power/tuxonice-internals.txt b/Documentation/power/tuxonice-internals.txt -new file mode 100644 -index 0000000..7a96186 ---- /dev/null -+++ b/Documentation/power/tuxonice-internals.txt -@@ -0,0 +1,477 @@ -+ TuxOnIce 3.0 Internal Documentation. -+ Updated to 26 March 2009 -+ -+1. Introduction. -+ -+ TuxOnIce 3.0 is an addition to the Linux Kernel, designed to -+ allow the user to quickly shutdown and quickly boot a computer, without -+ needing to close documents or programs. It is equivalent to the -+ hibernate facility in some laptops. This implementation, however, -+ requires no special BIOS or hardware support. -+ -+ The code in these files is based upon the original implementation -+ prepared by Gabor Kuti and additional work by Pavel Machek and a -+ host of others. 
This code has been substantially reworked by Nigel -+ Cunningham, again with the help and testing of many others, not the -+ least of whom is Michael Frank. At its heart, however, the operation is -+ essentially the same as Gabor's version. -+ -+2. Overview of operation. -+ -+ The basic sequence of operations is as follows: -+ -+ a. Quiesce all other activity. -+ b. Ensure enough memory and storage space are available, and attempt -+ to free memory/storage if necessary. -+ c. Allocate the required memory and storage space. -+ d. Write the image. -+ e. Power down. -+ -+ There are a number of complicating factors which mean that things are -+ not as simple as the above would imply, however... -+ -+ o The activity of each process must be stopped at a point where it will -+ not be holding locks necessary for saving the image, or unexpectedly -+ restart operations due to something like a timeout and thereby make -+ our image inconsistent. -+ -+ o It is desirous that we sync outstanding I/O to disk before calculating -+ image statistics. This reduces corruption if one should suspend but -+ then not resume, and also makes later parts of the operation safer (see -+ below). -+ -+ o We need to get as close as we can to an atomic copy of the data. -+ Inconsistencies in the image will result in inconsistent memory contents at -+ resume time, and thus in instability of the system and/or file system -+ corruption. This would appear to imply a maximum image size of one half of -+ the amount of RAM, but we have a solution... (again, below). -+ -+ o In 2.6, we choose to play nicely with the other suspend-to-disk -+ implementations. -+ -+3. Detailed description of internals. -+ -+ a. Quiescing activity. -+ -+ Safely quiescing the system is achieved using three separate but related -+ aspects. -+ -+ First, we note that the vast majority of processes don't need to run during -+ suspend. They can be 'frozen'. 
We therefore implement a refrigerator -+ routine, which processes enter and in which they remain until the cycle is -+ complete. Processes enter the refrigerator via try_to_freeze() invocations -+ at appropriate places. A process cannot be frozen in any old place. It -+ must not be holding locks that will be needed for writing the image or -+ freezing other processes. For this reason, userspace processes generally -+ enter the refrigerator via the signal handling code, and kernel threads at -+ the place in their event loops where they drop locks and yield to other -+ processes or sleep. -+ -+ The task of freezing processes is complicated by the fact that there can be -+ interdependencies between processes. Freezing process A before process B may -+ mean that process B cannot be frozen, because it stops at waiting for -+ process A rather than in the refrigerator. This issue is seen where -+ userspace waits on freezeable kernel threads or fuse filesystem threads. To -+ address this issue, we implement the following algorithm for quiescing -+ activity: -+ -+ - Freeze filesystems (including fuse - userspace programs starting -+ new requests are immediately frozen; programs already running -+ requests complete their work before being frozen in the next -+ step) -+ - Freeze userspace -+ - Thaw filesystems (this is safe now that userspace is frozen and no -+ fuse requests are outstanding). -+ - Invoke sys_sync (noop on fuse). -+ - Freeze filesystems -+ - Freeze kernel threads -+ -+ If we need to free memory, we thaw kernel threads and filesystems, but not -+ userspace. We can then free caches without worrying about deadlocks due to -+ swap files being on frozen filesystems or such like. -+ -+ b. Ensure enough memory & storage are available. -+ -+ We have a number of constraints to meet in order to be able to successfully -+ suspend and resume. -+ -+ First, the image will be written in two parts, described below. 
One of these -+ parts needs to have an atomic copy made, which of course implies a maximum -+ size of one half of the amount of system memory. The other part ('pageset') -+ is not atomically copied, and can therefore be as large or small as desired. -+ -+ Second, we have constraints on the amount of storage available. In these -+ calculations, we may also consider any compression that will be done. The -+ cryptoapi module allows the user to configure an expected compression ratio. -+ -+ Third, the user can specify an arbitrary limit on the image size, in -+ megabytes. This limit is treated as a soft limit, so that we don't fail the -+ attempt to suspend if we cannot meet this constraint. -+ -+ c. Allocate the required memory and storage space. -+ -+ Having done the initial freeze, we determine whether the above constraints -+ are met, and seek to allocate the metadata for the image. If the constraints -+ are not met, or we fail to allocate the required space for the metadata, we -+ seek to free the amount of memory that we calculate is needed and try again. -+ We allow up to four iterations of this loop before aborting the cycle. If we -+ do fail, it should only be because of a bug in TuxOnIce's calculations. -+ -+ These steps are merged together in the prepare_image function, found in -+ prepare_image.c. The functions are merged because of the cyclical nature -+ of the problem of calculating how much memory and storage is needed. Since -+ the data structures containing the information about the image must -+ themselves take memory and use storage, the amount of memory and storage -+ required changes as we prepare the image. Since the changes are not large, -+ only one or two iterations will be required to achieve a solution. 
-+
-+ The recursive nature of the algorithm is minimised by keeping user space
-+ frozen while preparing the image, and by the fact that our records of which
-+ pages are to be saved and which pageset they are saved in use bitmaps (so
-+ that changes in number or fragmentation of the pages to be saved don't
-+ feed back via changes in the amount of memory needed for metadata). The
-+ recursiveness is thus limited to any extra slab pages allocated to store the
-+ extents that record storage used, and the effects of seeking to free memory.
-+
-+ d. Write the image.
-+
-+ We previously mentioned the need to create an atomic copy of the data, and
-+ the half-of-memory limitation that is implied in this. This limitation is
-+ circumvented by dividing the memory to be saved into two parts, called
-+ pagesets.
-+
-+ Pageset2 contains most of the page cache - the pages on the active and
-+ inactive LRU lists that aren't needed or modified while TuxOnIce is
-+ running, so they can be safely written without an atomic copy. They are
-+ therefore saved first and reloaded last. While saving these pages,
-+ TuxOnIce carefully ensures that the work of writing the pages doesn't make
-+ the image inconsistent. With the support for Kernel (Video) Mode Setting
-+ going into the kernel at the time of writing, we need to check for pages
-+ on the LRU that are used by KMS, and exclude them from pageset2. They are
-+ atomically copied as part of pageset 1.
-+
-+ Once pageset2 has been saved, we prepare to do the atomic copy of remaining
-+ memory. As part of the preparation, we power down drivers, thereby providing
-+ them with the opportunity to have their state recorded in the image. The
-+ amount of memory allocated by drivers for this is usually negligible, but if
-+ DRI is in use, video drivers may require significant amounts. Ideally we
-+ would be able to query drivers while preparing the image as to the amount of
-+ memory they will need.
Unfortunately no such mechanism exists at the time of
-+ writing. For this reason, TuxOnIce allows the user to set an
-+ 'extra_pages_allowance', which is used to seek to ensure sufficient memory
-+ is available for drivers at this point. TuxOnIce also lets the user set this
-+ value to 0. In this case, a test driver suspend is done while preparing the
-+ image, and the difference (plus a margin) used instead. TuxOnIce will also
-+ automatically restart the hibernation process (twice at most) if it finds
-+ that the extra pages allowance is not sufficient. It will then use what was
-+ actually needed (plus a margin, again). Failure to hibernate should thus
-+ be an extremely rare occurrence.
-+
-+ Having suspended the drivers, we save the CPU context before making an
-+ atomic copy of pageset1, resuming the drivers and saving the atomic copy.
-+ After saving the two pagesets, we just need to save our metadata before
-+ powering down.
-+
-+ As we mentioned earlier, the contents of pageset2 pages aren't needed once
-+ they've been saved. We therefore use them as the destination of our atomic
-+ copy. In the unlikely event that pageset1 is larger, extra pages are
-+ allocated while the image is being prepared. This is normally only a real
-+ possibility when the system has just been booted and the page cache is
-+ small.
-+
-+ This is where we need to be careful about syncing, however. Pageset2 will
-+ probably contain filesystem metadata. If this is overwritten with pageset1
-+ and then a sync occurs, the filesystem will be corrupted - at least until
-+ resume time and another sync of the restored data. Since there is a
-+ possibility that the user might not resume or (may it never be!) that
-+ TuxOnIce might oops, we do our utmost to avoid syncing filesystems after
-+ copying pageset1.
-+
-+ e. Power down.
-+
-+ Powering down uses standard kernel routines. TuxOnIce supports powering down
-+ using the ACPI S3, S4 and S5 methods or the kernel's non-ACPI power-off.
-+ Supporting suspend to ram (S3) as a power off option might sound strange, -+ but it allows the user to quickly get their system up and running again if -+ the battery doesn't run out (we just need to re-read the overwritten pages) -+ and if the battery does run out (or the user removes power), they can still -+ resume. -+ -+4. Data Structures. -+ -+ TuxOnIce uses three main structures to store its metadata and configuration -+ information: -+ -+ a) Pageflags bitmaps. -+ -+ TuxOnIce records which pages will be in pageset1, pageset2, the destination -+ of the atomic copy and the source of the atomically restored image using -+ bitmaps. The code used is that written for swsusp, with small improvements -+ to match TuxOnIce's requirements. -+ -+ The pageset1 bitmap is thus easily stored in the image header for use at -+ resume time. -+ -+ As mentioned above, using bitmaps also means that the amount of memory and -+ storage required for recording the above information is constant. This -+ greatly simplifies the work of preparing the image. In earlier versions of -+ TuxOnIce, extents were used to record which pages would be stored. In that -+ case, however, eating memory could result in greater fragmentation of the -+ lists of pages, which in turn required more memory to store the extents and -+ more storage in the image header. These could in turn require further -+ freeing of memory, and another iteration. All of this complexity is removed -+ by having bitmaps. -+ -+ Bitmaps also make a lot of sense because TuxOnIce only ever iterates -+ through the lists. There is therefore no cost to not being able to find the -+ nth page in order 0 time. We only need to worry about the cost of finding -+ the n+1th page, given the location of the nth page. Bitwise optimisations -+ help here. -+ -+ b) Extents for block data. -+ -+ TuxOnIce supports writing the image to multiple block devices. 
In the case -+ of swap, multiple partitions and/or files may be in use, and we happily use -+ them all (with the exception of compcache pages, which we allocate but do -+ not use). This use of multiple block devices is accomplished as follows: -+ -+ Whatever the actual source of the allocated storage, the destination of the -+ image can be viewed in terms of one or more block devices, and on each -+ device, a list of sectors. To simplify matters, we only use contiguous, -+ PAGE_SIZE aligned sectors, like the swap code does. -+ -+ Since sector numbers on each bdev may well not start at 0, it makes much -+ more sense to use extents here. Contiguous ranges of pages can thus be -+ represented in the extents by contiguous values. -+ -+ Variations in block size are taken account of in transforming this data -+ into the parameters for bio submission. -+ -+ We can thus implement a layer of abstraction wherein the core of TuxOnIce -+ doesn't have to worry about which device we're currently writing to or -+ where in the device we are. It simply requests that the next page in the -+ pageset or header be written, leaving the details to this lower layer. -+ The lower layer remembers where in the sequence of devices and blocks each -+ pageset starts. The header always starts at the beginning of the allocated -+ storage. -+ -+ So extents are: -+ -+ struct extent { -+ unsigned long minimum, maximum; -+ struct extent *next; -+ } -+ -+ These are combined into chains of extents for a device: -+ -+ struct extent_chain { -+ int size; /* size of the extent ie sum (max-min+1) */ -+ int allocs, frees; -+ char *name; -+ struct extent *first, *last_touched; -+ }; -+ -+ For each bdev, we need to store a little more info: -+ -+ struct suspend_bdev_info { -+ struct block_device *bdev; -+ dev_t dev_t; -+ int bmap_shift; -+ int blocks_per_page; -+ }; -+ -+ The dev_t is used to identify the device in the stored image. 
As a result,
-+ we expect devices at resume time to have the same major and minor numbers
-+ as they had while suspending. This is primarily a concern where the user
-+ utilises LVM for storage, as they will need to dmsetup their partitions in
-+ such a way as to maintain this consistency at resume time.
-+
-+ bmap_shift and blocks_per_page apply the effects of variations in blocks
-+ per page settings for the filesystem and underlying bdev. For most
-+ filesystems, these are the same, but for xfs, they can have independent
-+ values.
-+
-+ Combining these two structures, we have everything we need to
-+ record what devices and what blocks on each device are being used to
-+ store the image, and to submit I/O using submit_bio.
-+
-+ The last elements in the picture are a means of recording how the storage
-+ is being used.
-+
-+ We do this first and foremost by implementing a layer of abstraction on
-+ top of the devices and extent chains which allows us to view however many
-+ devices there might be as one long storage tape, with a single 'head' that
-+ tracks a 'current position' on the tape:
-+
-+ struct extent_iterate_state {
-+         struct extent_chain *chains;
-+         int num_chains;
-+         int current_chain;
-+         struct extent *current_extent;
-+         unsigned long current_offset;
-+ };
-+
-+ That is, *chains points to an array of size num_chains of extent chains.
-+ For the filewriter, this is always a single chain. For the swapwriter, the
-+ array is of size MAX_SWAPFILES.
-+
-+ current_chain, current_extent and current_offset thus point to the current
-+ index in the chains array (and into a matching array of struct
-+ suspend_bdev_info), the current extent in that chain (to optimise access),
-+ and the current value in the offset.
-+
-+ The image is divided into three parts:
-+ - The header
-+ - Pageset 1
-+ - Pageset 2
-+
-+ The header always starts at the first device and first block.
We know its
-+ size before we begin to save the image because we carefully account for
-+ everything that will be stored in it.
-+
-+ The second pageset (LRU) is stored first. It begins on the next page after
-+ the end of the header.
-+
-+ The first pageset is stored second. Its start location is only known once
-+ pageset2 has been saved, since pageset2 may be compressed as it is written.
-+ This location is thus recorded at the end of saving pageset2. It is also
-+ page aligned.
-+
-+ Since this information is needed at resume time, and the location of extents
-+ in memory will differ at resume time, this needs to be stored in a portable
-+ way:
-+
-+ struct extent_iterate_saved_state {
-+         int chain_num;
-+         int extent_num;
-+         unsigned long offset;
-+ };
-+
-+ We can thus implement a layer of abstraction wherein the core of TuxOnIce
-+ doesn't have to worry about which device we're currently writing to or
-+ where in the device we are. It simply requests that the next page in the
-+ pageset or header be written, leaving the details to this layer, and
-+ invokes the routines to remember and restore the position, without having
-+ to worry about the details of how the data is arranged on disk or such like.
-+
-+ c) Modules
-+
-+ One aim in designing TuxOnIce was to make it flexible. We wanted to allow
-+ for the implementation of different methods of transforming a page to be
-+ written to disk and different methods of getting the pages stored.
-+
-+ In early versions (the betas and perhaps Suspend1), compression support was
-+ inlined in the image writing code, and the data structures and code for
-+ managing swap were intertwined with the rest of the code. A number of people
-+ had expressed interest in implementing image encryption, and alternative
-+ methods of storing the image.
-+
-+ In order to achieve this, TuxOnIce was given a modular design.
-+
-+ A module is a single file which encapsulates the functionality needed
-+ to transform a pageset of data (encryption or compression, for example),
-+ or to write the pageset to a device. The former type of module is called
-+ a 'page-transformer', the latter a 'writer'.
-+
-+ Modules are linked together in pipeline fashion. There may be zero or more
-+ page transformers in a pipeline, and there is always exactly one writer.
-+ The pipeline follows this pattern:
-+
-+ ---------------------------------
-+ | TuxOnIce Core |
-+ ---------------------------------
-+ |
-+ |
-+ ---------------------------------
-+ | Page transformer 1 |
-+ ---------------------------------
-+ |
-+ |
-+ ---------------------------------
-+ | Page transformer 2 |
-+ ---------------------------------
-+ |
-+ |
-+ ---------------------------------
-+ | Writer |
-+ ---------------------------------
-+
-+ During the writing of an image, the core code feeds pages one at a time
-+ to the first module. This module performs whatever transformations it
-+ implements on the incoming data, completely consuming the incoming data and
-+ feeding output in a similar manner to the next module.
-+
-+ All routines are SMP safe, and the final result of the transformations is
-+ written with an index (provided by the core) and size of the output by the
-+ writer. As a result, we can have multithreaded I/O without needing to
-+ worry about the sequence in which pages are written (or read).
-+
-+ During reading, the pipeline works in the reverse direction. The core code
-+ calls the first module with the address of a buffer which should be filled.
-+ (Note that the buffer size is always PAGE_SIZE at this time). This module
-+ will in turn request data from the next module and so on down until the
-+ writer is made to read from the stored image.
-+
-+ Part of the definition of the structure of a module thus looks like this:
-+
-+ int (*rw_init) (int rw, int stream_number);
-+ int (*rw_cleanup) (int rw);
-+ int (*write_chunk) (struct page *buffer_page);
-+ int (*read_chunk) (struct page *buffer_page, int sync);
-+
-+ It should be noted that the _cleanup routine may be called before the
-+ full stream of data has been read or written. While writing the image,
-+ the user may (depending upon settings) choose to abort suspending, and
-+ if we are in the midst of writing the last portion of the image, a portion
-+ of the second pageset may be reread. This may also happen if an error
-+ occurs and we seek to abort the process of writing the image.
-+
-+ The modular design is also useful in a number of other ways. It provides
-+ a means whereby we can add support for:
-+
-+ - providing overall initialisation and cleanup routines;
-+ - serialising configuration information in the image header;
-+ - providing debugging information to the user;
-+ - determining memory and image storage requirements;
-+ - dis/enabling components at run-time;
-+ - configuring the module (see below);
-+
-+ ...and routines for writers specific to their work:
-+ - Parsing a resume= location;
-+ - Determining whether an image exists;
-+ - Marking a resume as having been attempted;
-+ - Invalidating an image;
-+
-+ Since some parts of the core - the user interface and storage manager
-+ support - have use for some of these functions, they are registered as
-+ 'miscellaneous' modules as well.
-+
-+ d) Sysfs data structures.
-+
-+ This brings us naturally to support for configuring TuxOnIce. We desired to
-+ provide a way to make TuxOnIce as flexible and configurable as possible.
-+ The user shouldn't have to reboot just because they want to now hibernate to
-+ a file instead of a partition, for example.
-+
-+ To accomplish this, TuxOnIce implements a very generic means whereby the
-+ core and modules can register new sysfs entries.
All TuxOnIce entries use -+ a single _store and _show routine, both of which are found in -+ tuxonice_sysfs.c in the kernel/power directory. These routines handle the -+ most common operations - getting and setting the values of bits, integers, -+ longs, unsigned longs and strings in one place, and allow overrides for -+ customised get and set options as well as side-effect routines for all -+ reads and writes. -+ -+ When combined with some simple macros, a new sysfs entry can then be defined -+ in just a couple of lines: -+ -+ SYSFS_INT("progress_granularity", SYSFS_RW, &progress_granularity, 1, -+ 2048, 0, NULL), -+ -+ This defines a sysfs entry named "progress_granularity" which is rw and -+ allows the user to access an integer stored at &progress_granularity, giving -+ it a value between 1 and 2048 inclusive. -+ -+ Sysfs entries are registered under /sys/power/tuxonice, and entries for -+ modules are located in a subdirectory named after the module. -+ -diff --git a/Documentation/power/tuxonice.txt b/Documentation/power/tuxonice.txt -new file mode 100644 -index 0000000..3bf0575 ---- /dev/null -+++ b/Documentation/power/tuxonice.txt -@@ -0,0 +1,948 @@ -+ --- TuxOnIce, version 3.0 --- -+ -+1. What is it? -+2. Why would you want it? -+3. What do you need to use it? -+4. Why not just use the version already in the kernel? -+5. How do you use it? -+6. What do all those entries in /sys/power/tuxonice do? -+7. How do you get support? -+8. I think I've found a bug. What should I do? -+9. When will XXX be supported? -+10 How does it work? -+11. Who wrote TuxOnIce? -+ -+1. What is it? -+ -+ Imagine you're sitting at your computer, working away. For some reason, you -+ need to turn off your computer for a while - perhaps it's time to go home -+ for the day. When you come back to your computer next, you're going to want -+ to carry on where you left off. 
Now imagine that you could push a button and
-+ have your computer store the contents of its memory to disk and power down.
-+ Then, when you next start up your computer, it loads that image back into
-+ memory and you can carry on from where you were, just as if you'd never
-+ turned the computer off. Starting up takes far less time, with no reopening
-+ of applications or finding what directory you put that file in yesterday.
-+ That's what TuxOnIce does.
-+
-+ TuxOnIce has a long heritage. It began life as work by Gabor Kuti, who,
-+ with some help from Pavel Machek, got an early version going in 1999. The
-+ project was then taken over by Florent Chabaud while still in alpha version
-+ numbers. Nigel Cunningham came on the scene when Florent was unable to
-+ continue, moving the project into betas, then 1.0, 2.0 and so on up to
-+ the present series. During the 2.0 series, the name was contracted to
-+ Suspend2 and the website suspend2.net created. Beginning around July 2007,
-+ a transition to calling the software TuxOnIce was made, to seek to help
-+ make it clear that TuxOnIce is more concerned with hibernation than suspend
-+ to ram.
-+
-+ Pavel Machek's swsusp code, which was merged around 2.5.17, retains the
-+ original name, and was essentially a fork of the beta code until Rafael
-+ Wysocki came on the scene in 2005 and began to improve it further.
-+
-+2. Why would you want it?
-+
-+ Why wouldn't you want it?
-+
-+ Being able to save the state of your system and quickly restore it improves
-+ your productivity - you get a useful system in far less time than through
-+ the normal boot process. You also get to be completely 'green', using zero
-+ power, or as close to that as possible (the computer may still provide
-+ minimal power to some devices, so they can initiate a power on, but that
-+ will be the same amount of power as would be used if you told the computer
-+ to shut down).
-+
-+3. What do you need to use it?
-+
-+ a. Kernel Support.
-+ -+ i) The TuxOnIce patch. -+ -+ TuxOnIce is part of the Linux Kernel. This version is not part of Linus's -+ 2.6 tree at the moment, so you will need to download the kernel source and -+ apply the latest patch. Having done that, enable the appropriate options in -+ make [menu|x]config (under Power Management Options - look for "Enhanced -+ Hibernation"), compile and install your kernel. TuxOnIce works with SMP, -+ Highmem, preemption, fuse filesystems, x86-32, PPC and x86_64. -+ -+ TuxOnIce patches are available from http://tuxonice.net. -+ -+ ii) Compression support. -+ -+ Compression support is implemented via the cryptoapi. You will therefore want -+ to select any Cryptoapi transforms that you want to use on your image from -+ the Cryptoapi menu while configuring your kernel. We recommend the use of the -+ LZO compression method - it is very fast and still achieves good compression. -+ -+ You can also tell TuxOnIce to write its image to an encrypted and/or -+ compressed filesystem/swap partition. In that case, you don't need to do -+ anything special for TuxOnIce when it comes to kernel configuration. -+ -+ iii) Configuring other options. -+ -+ While you're configuring your kernel, try to configure as much as possible -+ to build as modules. We recommend this because there are a number of drivers -+ that are still in the process of implementing proper power management -+ support. In those cases, the best way to work around their current lack is -+ to build them as modules and remove the modules while hibernating. You might -+ also bug the driver authors to get their support up to speed, or even help! -+ -+ b. Storage. -+ -+ i) Swap. -+ -+ TuxOnIce can store the hibernation image in your swap partition, a swap file or -+ a combination thereof. Whichever combination you choose, you will probably -+ want to create enough swap space to store the largest image you could have, -+ plus the space you'd normally use for swap. 
A good rule of thumb would be
-+ to calculate the amount of swap you'd want without using TuxOnIce, and then
-+ add the amount of memory you have. This swapspace can be arranged in any way
-+ you'd like. It can be in one partition or file, or spread over a number. The
-+ only requirement is that they be active when you start a hibernation cycle.
-+
-+ There is one exception to this requirement. TuxOnIce has the ability to turn
-+ on one swap file or partition at the start of hibernating and turn it back off
-+ at the end. If you want to ensure you have enough memory to store an image
-+ when your memory is fully used, you might want to make one swap partition or
-+ file for 'normal' use, and another for TuxOnIce to activate & deactivate
-+ automatically. (Further details below).
-+
-+ ii) Normal files.
-+
-+ TuxOnIce includes a 'file allocator'. The file allocator can store your
-+ image in a simple file. Since Linux has the concept of everything being a
-+ file, this is more powerful than it initially sounds. If, for example, you
-+ were to set up a network block device file, you could hibernate to a network
-+ server. This has been tested and works to a point, but nbd itself isn't
-+ stateless enough for our purposes.
-+
-+ Take extra care when setting up the file allocator. If you just type
-+ commands without thinking and then try to hibernate, you could cause
-+ irreversible corruption on your filesystems! Make sure you have backups.
-+
-+ Most people will only want to hibernate to a local file. To achieve that, do
-+ something along the lines of:
-+
-+ echo "TuxOnIce" > /hibernation-file
-+ dd if=/dev/zero bs=1M count=512 >> /hibernation-file
-+
-+ This will create a 512MB file called /hibernation-file.
To get TuxOnIce to use -+ it: -+ -+ echo /hibernation-file > /sys/power/tuxonice/file/target -+ -+ Then -+ -+ cat /sys/power/tuxonice/resume -+ -+ Put the results of this into your bootloader's configuration (see also step -+ C, below): -+ -+ ---EXAMPLE-ONLY-DON'T-COPY-AND-PASTE--- -+ # cat /sys/power/tuxonice/resume -+ file:/dev/hda2:0x1e001 -+ -+ In this example, we would edit the append= line of our lilo.conf|menu.lst -+ so that it included: -+ -+ resume=file:/dev/hda2:0x1e001 -+ ---EXAMPLE-ONLY-DON'T-COPY-AND-PASTE--- -+ -+ For those who are thinking 'Could I make the file sparse?', the answer is -+ 'No!'. At the moment, there is no way for TuxOnIce to fill in the holes in -+ a sparse file while hibernating. In the longer term (post merge!), I'd like -+ to change things so that the file could be dynamically resized and have -+ holes filled as needed. Right now, however, that's not possible and not a -+ priority. -+ -+ c. Bootloader configuration. -+ -+ Using TuxOnIce also requires that you add an extra parameter to -+ your lilo.conf or equivalent. Here's an example for a swap partition: -+ -+ append="resume=swap:/dev/hda1" -+ -+ This would tell TuxOnIce that /dev/hda1 is a swap partition you -+ have. TuxOnIce will use the swap signature of this partition as a -+ pointer to your data when you hibernate. This means that (in this example) -+ /dev/hda1 doesn't need to be _the_ swap partition where all of your data -+ is actually stored. It just needs to be a swap partition that has a -+ valid signature. -+ -+ You don't need to have a swap partition for this purpose. TuxOnIce -+ can also use a swap file, but usage is a little more complex. Having made -+ your swap file, turn it on and do -+ -+ cat /sys/power/tuxonice/swap/headerlocations -+ -+ (this assumes you've already compiled your kernel with TuxOnIce -+ support and booted it). 
The results of the cat command will tell you -+ what you need to put in lilo.conf: -+ -+ For swap partitions like /dev/hda1, simply use resume=/dev/hda1. -+ For swapfile `swapfile`, use resume=swap:/dev/hda2:0x242d. -+ -+ If the swapfile changes for any reason (it is moved to a different -+ location, it is deleted and recreated, or the filesystem is -+ defragmented) then you will have to check -+ /sys/power/tuxonice/swap/headerlocations for a new resume_block value. -+ -+ Once you've compiled and installed the kernel and adjusted your bootloader -+ configuration, you should only need to reboot for the most basic part -+ of TuxOnIce to be ready. -+ -+ If you only compile in the swap allocator, or only compile in the file -+ allocator, you don't need to add the "swap:" part of the resume= -+ parameters above. resume=/dev/hda2:0x242d will work just as well. If you -+ have compiled both and your storage is on swap, you can also use this -+ format (the swap allocator is the default allocator). -+ -+ When compiling your kernel, one of the options in the 'Power Management -+ Support' menu, just above the 'Enhanced Hibernation (TuxOnIce)' entry is -+ called 'Default resume partition'. This can be used to set a default value -+ for the resume= parameter. -+ -+ d. The hibernate script. -+ -+ Since the driver model in 2.6 kernels is still being developed, you may need -+ to do more than just configure TuxOnIce. Users of TuxOnIce usually start the -+ process via a script which prepares for the hibernation cycle, tells the -+ kernel to do its stuff and then restore things afterwards. This script might -+ involve: -+ -+ - Switching to a text console and back if X doesn't like the video card -+ status on resume. -+ - Un/reloading drivers that don't play well with hibernation. -+ -+ Note that you might not be able to unload some drivers if there are -+ processes using them. You might have to kill off processes that hold -+ devices open. 
Hint: if your X server accesses a USB mouse, doing a
-+ 'chvt' to a text console releases the device and you can unload the
-+ module.
-+
-+ Check out the latest script (available on tuxonice.net).
-+
-+ e. The userspace user interface.
-+
-+ TuxOnIce has very limited support for displaying status if you only apply
-+ the kernel patch - it can printk messages, but that is all. In addition,
-+ some of the functions mentioned in this document (such as cancelling a cycle
-+ or performing interactive debugging) are unavailable. To utilise these
-+ functions, or simply get a nice display, you need the 'userui' component.
-+ Userui comes in three flavours: usplash, fbsplash and text. Text should
-+ work on any console. Usplash and fbsplash require the appropriate
-+ (distro specific?) support.
-+
-+ To utilise a userui, TuxOnIce just needs to be told where to find the
-+ userspace binary:
-+
-+ echo "/usr/local/sbin/tuxoniceui_fbsplash" > /sys/power/tuxonice/user_interface/program
-+
-+ The hibernate script can do this for you, and a default value for this
-+ setting can be configured when compiling the kernel. This path is also
-+ stored in the image header, so if you have an initrd or initramfs, you can
-+ use the userui during the first part of resuming (prior to the atomic
-+ restore) by putting the binary in the same path in your initrd/ramfs.
-+ Alternatively, you can put it in a different location and do an echo
-+ similar to the above prior to the echo > do_resume. The value saved in the
-+ image header will then be ignored.
-+
-+4. Why not just use the version already in the kernel?
-+
-+ The version in the vanilla kernel has a number of drawbacks. The most
-+ serious of these are:
-+ - it has a maximum image size of 1/2 total memory;
-+ - it doesn't allocate storage until after it has snapshotted memory.
-+ This means that you can't be sure hibernating will work until you -+ see it start to write the image; -+ - it does not allow you to press escape to cancel a cycle; -+ - it does not allow you to press escape to cancel resuming; -+ - it does not allow you to automatically swapon a file when -+ starting a cycle; -+ - it does not allow you to use multiple swap partitions or files; -+ - it does not allow you to use ordinary files; -+ - it just invalidates an image and continues to boot if you -+ accidentally boot the wrong kernel after hibernating; -+ - it doesn't support any sort of nice display while hibernating; -+ - it is moving toward requiring that you have an initrd/initramfs -+ to ever have a hope of resuming (uswsusp). While uswsusp will -+ address some of the concerns above, it won't address all of them, -+ and will be more complicated to get set up; -+ - it doesn't have support for suspend-to-both (write a hibernation -+ image, then suspend to ram; I think this is known as ReadySafe -+ under M$). -+ -+5. How do you use it? -+ -+ A hibernation cycle can be started directly by doing: -+ -+ echo > /sys/power/tuxonice/do_hibernate -+ -+ In practice, though, you'll probably want to use the hibernate script -+ to unload modules, configure the kernel the way you like it and so on. -+ In that case, you'd do (as root): -+ -+ hibernate -+ -+ See the hibernate script's man page for more details on the options it -+ takes. -+ -+ If you're using the text or splash user interface modules, one feature of -+ TuxOnIce that you might find useful is that you can press Escape at any time -+ during hibernating, and the process will be aborted. -+ -+ Due to the way hibernation works, this means you'll have your system back and -+ perfectly usable almost instantly. The only exception is when it's at the -+ very end of writing the image. Then it will need to reload a small (usually -+ 4-50MBs, depending upon the image characteristics) portion first. 
-+
-+ Likewise, when resuming, you can press escape and resuming will be aborted.
-+ The computer will then power down again according to the settings at that
-+ time for the powerdown method or rebooting.
-+
-+ You can change the settings for powering down while the image is being
-+ written by pressing 'R' to toggle rebooting and 'O' to toggle between
-+ suspending to ram and powering down completely.
-+
-+ If you run into problems with resuming, adding the "noresume" option to
-+ the kernel command line will let you skip the resume step and recover your
-+ system. This option shouldn't normally be needed, because TuxOnIce modifies
-+ the image header prior to the atomic restore, and will thus prompt you
-+ if it detects that you've tried to resume an image before (this flag is
-+ removed if you press Escape to cancel a resume, so you won't be prompted
-+ then).
-+
-+ Recent kernels (2.6.24 onwards) add support for resuming from a different
-+ kernel to the one that was hibernated (thanks to Rafael for his work on
-+ this - I've just embraced and enhanced the support for TuxOnIce). This
-+ should further reduce the need for you to use the noresume option.
-+
-+6. What do all those entries in /sys/power/tuxonice do?
-+
-+ /sys/power/tuxonice is the directory which contains files you can use to
-+ tune and configure TuxOnIce to your liking. The exact contents of
-+ the directory will depend upon the version of TuxOnIce you're
-+ running and the options you selected at compile time. In the following
-+ descriptions, names in brackets refer to compile time options.
-+ (Note that they're all dependent upon you having selected CONFIG_TUXONICE
-+ in the first place!).
-+
-+ Since the values of these settings can open potential security risks, the
-+ writeable ones are accessible only to the root user. You may want to
-+ configure sudo to allow you to invoke your hibernate script as an ordinary
-+ user.
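-+ For example, a sudoers rule along the following lines would let one user run
-+ the script without a password (the user name and script path here are
-+ illustrative; adjust them to your system and edit with visudo):

```text
# /etc/sudoers.d/hibernate -- illustrative only; edit with visudo
alice ALL = NOPASSWD: /usr/sbin/hibernate
```

-+ The user would then start a cycle with 'sudo hibernate'.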
-+ -+ - alloc/failure_test -+ -+ This debugging option provides a way of testing TuxOnIce's handling of -+ memory allocation failures. Each allocation type that TuxOnIce makes has -+ been given a unique number (see the source code). Echo the appropriate -+ number into this entry, and when TuxOnIce attempts to do that allocation, -+ it will pretend there was a failure and act accordingly. -+ -+ - alloc/find_max_mem_allocated -+ -+ This debugging option will cause TuxOnIce to find the maximum amount of -+ memory it used during a cycle, and report that information in debugging -+ information at the end of the cycle. -+ -+ - alt_resume_param -+ -+ Instead of powering down after writing a hibernation image, TuxOnIce -+ supports resuming from a different image. This entry lets you set the -+ location of the signature for that image (the resume= value you'd use -+ for it). Using an alternate image and keep_image mode, you can do things -+ like using an alternate image to power down an uninterruptible power -+ supply. -+ -+ - block_io/target_outstanding_io -+ -+ This value controls the amount of memory that the block I/O code says it -+ needs when the core code is calculating how much memory is needed for -+ hibernating and for resuming. It doesn't directly control the amount of -+ I/O that is submitted at any one time - that depends on the amount of -+ available memory (we may have more available than we asked for), the -+ throughput that is being achieved and the ability of the CPU to keep up -+ with disk throughput (particularly where we're compressing pages). -+ -+ - checksum/enabled -+ -+ Use cryptoapi hashing routines to verify that Pageset2 pages don't change -+ while we're saving the first part of the image, and to get any pages that -+ do change resaved in the atomic copy. This should normally not be needed, -+ but if you're seeing issues, please enable this. 
If your issues stop you
-+ being able to resume, enable this option, hibernate and cancel the cycle
-+ after the atomic copy is done. If the debugging info shows a non-zero
-+ number of pages resaved, please report this to Nigel.
-+
-+ - compression/algorithm
-+
-+ Set the cryptoapi algorithm used for compressing the image.
-+
-+ - compression/expected_compression
-+
-+ These values allow you to set an expected compression ratio, which TuxOnIce
-+ will use in calculating whether it meets constraints on the image size. If
-+ this expected compression ratio is not attained, the hibernation cycle will
-+ abort, so it is wise to allow some spare. You can see what compression
-+ ratio is achieved in the logs after hibernating.
-+
-+ - debug_info:
-+
-+ This file returns information about your configuration that may be helpful
-+ in diagnosing problems with hibernating.
-+
-+ - did_suspend_to_both:
-+
-+ This file can be used when you hibernate with powerdown method 3 (ie suspend
-+ to ram after writing the image). There can be two outcomes in this case. We
-+ can resume from the suspend-to-ram before the battery runs out, or we can run
-+ out of juice and end up resuming like normal. This entry lets you find out,
-+ post resume, which way we went. If the value is 1, we resumed from suspend
-+ to ram. This can be useful when actions need to be run post suspend-to-ram
-+ that don't need to be run if we did the normal resume from power off.
-+
-+ - do_hibernate:
-+
-+ When anything is written to this file, the kernel side of TuxOnIce will
-+ begin to attempt to write an image to disk and power down. You'll normally
-+ want to run the hibernate script instead, to get modules unloaded first.
-+
-+ - do_resume:
-+
-+ When anything is written to this file TuxOnIce will attempt to read and
-+ restore an image. If there is no image, it will return almost immediately.
-+ If an image exists, the echo > will never return.
Instead, the original
-+ kernel context will be restored and the original echo > do_hibernate will
-+ return.
-+
-+ - */enabled
-+
-+ These options can be used to temporarily disable various parts of TuxOnIce.
-+
-+ - extra_pages_allowance
-+
-+ When TuxOnIce does its atomic copy, it calls the driver model suspend
-+ and resume methods. If you have DRI enabled with a driver such as fglrx,
-+ this can result in the driver allocating a substantial amount of memory
-+ for storing its state. Extra_pages_allowance tells TuxOnIce how much
-+ extra memory it should ensure is available for those allocations. If
-+ your attempts at hibernating end with a message in dmesg indicating that
-+ insufficient extra pages were allowed, you need to increase this value.
-+
-+ - file/target:
-+
-+ Read this value to get the current setting. Write to it to point TuxOnIce
-+ at a new storage location for the file allocator. See section 3.b.ii above
-+ for details of how to set up the file allocator.
-+
-+ - freezer_test
-+
-+ This entry can be used to get TuxOnIce to just test the freezer and prepare
-+ an image without actually doing a hibernation cycle. It is useful for
-+ diagnosing freezing and image preparation issues.
-+
-+ - full_pageset2
-+
-+ TuxOnIce divides the pages that are stored in an image into two sets. The
-+ difference between the two sets is that pages in pageset 1 are atomically
-+ copied, and pages in pageset 2 are written to disk without being copied
-+ first. A page CAN be written to disk without being copied first if and only
-+ if its contents will not be modified or used at any time after userspace
-+ processes are frozen. A page MUST be in pageset 1 if its contents are
-+ modified or used at any time after userspace processes have been frozen.
-+
-+ Normally (ie if this option is enabled), TuxOnIce will put all pages on the
-+ per-zone LRUs in pageset2, then remove those pages used by any userspace
-+ user interface helper and TuxOnIce storage manager that are running,
-+ together with pages used by the GEM memory manager introduced around 2.6.28
-+ kernels.
-+
-+ If this option is disabled, a much more conservative approach will be taken.
-+ The only pages in pageset2 will be those belonging to userspace processes,
-+ with the exclusion of those belonging to the TuxOnIce userspace helpers
-+ mentioned above. This will result in a much smaller pageset2, and will
-+ therefore limit you to smaller images than are possible with this option
-+ enabled.
-+
-+ - ignore_rootfs
-+
-+ TuxOnIce records which device is mounted as the root filesystem when
-+ writing the hibernation image. It will normally check at resume time that
-+ this device isn't already mounted - that would be a cause of filesystem
-+ corruption. In some particular cases (RAM based root filesystems), you
-+ might want to disable this check. This option allows you to do that.
-+
-+ - image_exists:
-+
-+ Can be used in a script to determine whether a valid image exists at the
-+ location currently pointed to by resume=. Returns up to three lines.
-+ The first is whether an image exists (-1 for unsure, otherwise 0 or 1).
-+ If an image exists, additional lines will return the machine and version.
-+ Echoing anything to this entry removes any current image.
-+
-+ - image_size_limit:
-+
-+ The maximum size of hibernation image written to disk, measured in megabytes
-+ (1024*1024).
-+
-+ - last_result:
-+
-+ The result of the last hibernation cycle, as defined in
-+ include/linux/suspend-debug.h with the values SUSPEND_ABORTED to
-+ SUSPEND_KEPT_IMAGE. This is a bitmask.
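-+ As an example of scripting against these entries, a boot or status script
-+ might query image_exists like this (the have_image helper is illustrative;
-+ the sysfs root is parameterised so the sketch can be exercised away from a
-+ real TuxOnIce kernel):

```shell
# Illustrative helper: succeed if image_exists reports a valid image.
# TOI can be overridden for testing; defaults to the real sysfs directory.
have_image() {
    toi="${TOI:-/sys/power/tuxonice}"
    [ -r "$toi/image_exists" ] || return 1
    # First line of image_exists: -1 = unsure, 0 = no image, 1 = image present
    [ "$(head -n 1 "$toi/image_exists")" = "1" ]
}
```

-+ A script could then do something like:
-+ if have_image; then echo "An image is present - next boot will resume"; fi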
-+ -+ - late_cpu_hotplug: -+ -+ This sysfs entry controls whether cpu hotplugging is done - as normal - just -+ before (unplug) and after (replug) the atomic copy/restore (so that all -+ CPUs/cores are available for multithreaded I/O). The alternative is to -+ unplug all secondary CPUs/cores at the start of hibernating/resuming, and -+ replug them at the end of resuming. No multithreaded I/O will be possible in -+ this configuration, but the odd machine has been reported to require it. -+ -+ - lid_file: -+ -+ This determines which ACPI button file we look in to determine whether the -+ lid is open or closed after resuming from suspend to disk or power off. -+ If the entry is set to "lid/LID", we'll open /proc/acpi/button/lid/LID/state -+ and check its contents at the appropriate moment. See post_wake_state below -+ for more details on how this entry is used. -+ -+ - log_everything (CONFIG_PM_DEBUG): -+ -+ Setting this option results in all messages printed being logged. Normally, -+ only a subset are logged, so as to not slow the process and not clutter the -+ logs. Useful for debugging. It can be toggled during a cycle by pressing -+ 'L'. -+ -+ - no_load_direct: -+ -+ This is a debugging option. If, when loading the atomically copied pages of -+ an image, TuxOnIce finds that the destination address for a page is free, -+ it will normally allocate the image, load the data directly into that -+ address and skip it in the atomic restore. If this option is disabled, the -+ page will be loaded somewhere else and atomically restored like other pages. -+ -+ - no_flusher_thread: -+ -+ When doing multithreaded I/O (see below), the first online CPU can be used -+ to _just_ submit compressed pages when writing the image, rather than -+ compressing and submitting data. This option is normally disabled, but has -+ been included because Nigel would like to see whether it will be more useful -+ as the number of cores/cpus in computers increases. 
-+
-+ - no_multithreaded_io:
-+
-+ TuxOnIce will normally create one thread per cpu/core on your computer,
-+ each of which will then perform I/O. This will generally result in
-+ throughput that's the maximum the storage medium can handle. There
-+ shouldn't be any reason to disable multithreaded I/O now, but this option
-+ has been retained for debugging purposes.
-+
-+ - no_pageset2
-+
-+ See the entry for full_pageset2 above for an explanation of pagesets.
-+ Enabling this option causes TuxOnIce to do an atomic copy of all pages,
-+ thereby limiting the maximum image size to 1/2 of memory, as swsusp does.
-+
-+ - no_pageset2_if_unneeded
-+
-+ See the entry for full_pageset2 above for an explanation of pagesets.
-+ Enabling this option causes TuxOnIce to act like no_pageset2 was enabled
-+ if and only if it isn't needed anyway. This option may still make TuxOnIce
-+ less reliable because pageset2 pages are normally used to store the
-+ atomic copy - drivers that want to do allocations of larger amounts of
-+ memory in one shot will be more likely to find that those amounts aren't
-+ available if this option is enabled.
-+
-+ - pause_between_steps (CONFIG_PM_DEBUG):
-+
-+ This option is used during debugging, to make TuxOnIce pause between
-+ each step of the process. It is ignored when the nice display is on.
-+
-+ - post_wake_state:
-+
-+ TuxOnIce provides support for automatically waking after a user-selected
-+ delay, and using a different powerdown method if the lid is still closed.
-+ (Yes, we're assuming a laptop). This entry lets you choose what state
-+ should be entered next. The values are those described under
-+ powerdown_method, below. It can be used to suspend to RAM after hibernating,
-+ then power down properly after (say) 20 minutes. It can also be used to
-+ power down properly, then wake at (say) 6.30am and suspend to RAM until
-+ you're ready to use the machine.
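-+ Combined with the wake_alarm_dir and wake_delay entries described further
-+ down this list, a script might configure "suspend to RAM now, full power off
-+ after 20 minutes" along these lines (the function wrapper is illustrative;
-+ writing the real entries requires root and a TuxOnIce kernel, and the sysfs
-+ root is passed in so the sketch can be exercised on a plain system):

```shell
# Illustrative sketch: write the auto-wake settings documented in this file.
# On a real system the argument would be /sys/power/tuxonice.
setup_autowake() {
    toi="$1"
    echo rtc0 > "$toi/wake_alarm_dir"     # use /sys/class/rtc/rtc0
    echo 1200 > "$toi/wake_delay"         # wake 20 minutes after the image is written
    echo 5    > "$toi/post_wake_state"    # then power down via ACPI S5 if lid closed
    echo lid/LID > "$toi/lid_file"        # ACPI lid button entry to consult
}
```

-+ Run before starting a cycle with powerdown method 3 (suspend to RAM), this
-+ gives the behaviour described above.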
-+
-+ - powerdown_method:
-+
-+ Used to select a method by which TuxOnIce should power down after writing the
-+ image. Currently:
-+
-+ 0: Don't use ACPI to power off.
-+ 3: Attempt to enter Suspend-to-ram.
-+ 4: Attempt to enter ACPI S4 mode.
-+ 5: Attempt to power down via ACPI S5 mode.
-+
-+ Note that these options are highly dependent upon your hardware & software:
-+
-+ 3: When successful, your machine suspends to ram instead of powering off.
-+ The advantage of using this mode is that it doesn't matter whether your
-+ battery has enough charge to make it through to your next resume. If it
-+ lasts, you will simply resume from suspend to ram (and the image on disk
-+ will be discarded). If the battery runs out, you will resume from disk
-+ instead. The disadvantage is that it takes longer than a normal
-+ suspend-to-ram to enter the state, since the suspend-to-disk image needs
-+ to be written first.
-+ 4/5: When successful, your machine will be off and consume (almost) no power.
-+ But it might still react to some external events like opening the lid or
-+ traffic on a network or usb device. For the bios, resume is then the same
-+ as warm boot, similar to a situation where you used the command `reboot'
-+ to reboot your machine. If your machine has problems on warm boot or if
-+ you want to protect your machine with the bios password, this is probably
-+ not the right choice. Mode 4 may be necessary on some machines where ACPI
-+ wake up methods need to be run to properly reinitialise hardware after a
-+ hibernation cycle.
-+ 0: Switch the machine completely off. The only possible wakeup is the power
-+ button. For the bios, resume is then the same as a cold boot, in
-+ particular you would have to provide your bios boot password if your
-+ machine uses that feature for booting.
-+
-+ - progressbar_granularity_limit:
-+
-+ This option can be used to limit the granularity of the progress bar
-+ displayed with a bootsplash screen.
The value is the maximum number of
-+ steps. That is, 10 will make the progress bar jump in 10% increments.
-+
-+ - reboot:
-+
-+ This option causes TuxOnIce to reboot rather than powering down
-+ at the end of saving an image. It can be toggled during a cycle by pressing
-+ 'R'.
-+
-+ - resume:
-+
-+ This sysfs entry can be used to read and set the location in which TuxOnIce
-+ will look for the signature of an image - the value set using resume= at
-+ boot time or CONFIG_PM_STD_PARTITION ("Default resume partition"). By
-+ writing to this file as well as modifying your bootloader's configuration
-+ file (eg menu.lst), you can set or reset the location of your image or the
-+ method of storing the image without rebooting.
-+
-+ - replace_swsusp (CONFIG_TOI_REPLACE_SWSUSP):
-+
-+ This option makes
-+
-+ echo disk > /sys/power/state
-+
-+ activate TuxOnIce instead of swsusp. Regardless of whether this option is
-+ enabled, any invocation of swsusp's resume time trigger will cause TuxOnIce
-+ to check for an image too. This is due to the fact that at resume time, we
-+ can't know whether this option was enabled until we see if an image is there
-+ for us to resume from. (And when an image exists, we don't care whether we
-+ did replace swsusp anyway - we just want to resume).
-+
-+ - resume_commandline:
-+
-+ This entry can be read after resuming to see the commandline that was used
-+ when resuming began. You might use this to set up two bootloader entries
-+ that are the same apart from the fact that one includes an extra append=
-+ argument "at_work=1". You could then grep resume_commandline in your
-+ post-resume scripts and configure networking (for example) differently
-+ depending upon whether you're at home or work. resume_commandline can be
-+ set to arbitrary text if you wish to remove sensitive contents.
-+
-+ - swap/swapfilename:
-+
-+ This entry is used to specify the swapfile or partition that
-+ TuxOnIce will attempt to swapon/swapoff automatically.
Thus, if
-+ I normally use /dev/hda1 for swap, and want to use /dev/hda2 specifically
-+ for my hibernation image, I would
-+
-+ echo /dev/hda2 > /sys/power/tuxonice/swap/swapfile
-+
-+ /dev/hda2 would then be automatically swapon'd and swapoff'd. Note that the
-+ swapon and swapoff occur while other processes are frozen (including kswapd)
-+ so this swap file will not be used up when attempting to free memory. The
-+ partition/file is also given the highest priority, so other swapfiles/partitions
-+ will only be used to save the image when this one is filled.
-+
-+ The value of this file is used by headerlocations along with any currently
-+ activated swapfiles/partitions.
-+
-+ - swap/headerlocations:
-+
-+ This option tells you the resume= options to use for swap devices you
-+ currently have activated. It is particularly useful when you only want to
-+ use a swap file to store your image. See above for further details.
-+
-+ - test_bio
-+
-+ This is a debugging option. When enabled, TuxOnIce will not hibernate.
-+ Instead, when asked to write an image, it will skip the atomic copy,
-+ just doing the writing of the image and then returning control to the
-+ user at the point where it would have powered off. This is useful for
-+ testing throughput in different configurations.
-+
-+ - test_filter_speed
-+
-+ This is a debugging option. When enabled, TuxOnIce will not hibernate.
-+ Instead, when asked to write an image, it will not write anything or do
-+ an atomic copy, but will only run any enabled compression algorithm on the
-+ data that would have been written (the source pages of the atomic copy in
-+ the case of pageset 1). This is useful for comparing the performance of
-+ compression algorithms and for determining the extent to which an upgrade
-+ to your storage method would improve hibernation speed.
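-+ Several of the entries above (resume, swap/headerlocations) traffic in
-+ resume= strings of the form [allocator:]device[:offset]. A small
-+ illustrative parser (parse_resume is not part of TuxOnIce) makes the
-+ format explicit:

```shell
# Illustrative sketch: split a resume= value such as "swap:/dev/hda2:0x242d"
# into allocator, device and header offset. The swap allocator is the default
# when no allocator prefix is given, as documented above.
parse_resume() {
    v="$1"
    case "$v" in
        swap:*|file:*) alloc=${v%%:*}; v=${v#*:} ;;
        *)             alloc=swap ;;
    esac
    dev=${v%%:0x*}                 # device path up to any :0x... offset
    off=${v#"$dev"}; off=${off#:}  # remaining offset, if present
    echo "$alloc $dev ${off:-0}"
}
```

-+ For example, parse_resume swap:/dev/hda2:0x242d prints "swap /dev/hda2 0x242d".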
-+
-+ - user_interface/debug_sections (CONFIG_PM_DEBUG):
-+
-+ This value, together with the console log level, controls what debugging
-+ information is displayed. The console log level determines the level of
-+ detail, and this value determines what detail is displayed. This value is
-+ a bit vector, and the meaning of the bits can be found in the kernel tree
-+ in include/linux/tuxonice.h. It can be overridden using the kernel's
-+ command line option suspend_dbg.
-+
-+ - user_interface/default_console_level (CONFIG_PM_DEBUG):
-+
-+ This determines the value of the console log level at the start of a
-+ hibernation cycle. If debugging is compiled in, the console log level can be
-+ changed during a cycle by pressing the digit keys. Meanings are:
-+
-+ 0: Nice display.
-+ 1: Nice display plus numerical progress.
-+ 2: Errors only.
-+ 3: Low level debugging info.
-+ 4: Medium level debugging info.
-+ 5: High level debugging info.
-+ 6: Verbose debugging info.
-+
-+ - user_interface/enable_escape:
-+
-+ Setting this to "1" will enable you to abort a hibernation cycle or resuming
-+ by pressing escape, "0" (default) disables this feature. Note that enabling
-+ this option means that you cannot initiate a hibernation cycle and then walk
-+ away from your computer, expecting it to be secure. With this feature
-+ disabled, you can validly have this expectation once TuxOnIce begins to
-+ write the image to disk. (Prior to this point, it is possible that TuxOnIce
-+ might abort because of failure to freeze all processes or because constraints
-+ on its ability to save the image are not met).
-+
-+ - user_interface/program
-+
-+ This entry is used to tell TuxOnIce what userspace program to use for
-+ providing a user interface while hibernating. The program uses a netlink
-+ socket to pass messages back and forward to the kernel, allowing all of the
-+ functions formerly implemented in the kernel user interface components.
-+
-+ - version:
-+
-+ The version of TuxOnIce you have compiled into the currently running kernel.
-+
-+ - wake_alarm_dir:
-+
-+ As mentioned above (post_wake_state), TuxOnIce supports automatically waking
-+ after some delay. This entry allows you to select which wake alarm to use.
-+ It should contain the value "rtc0" if you're wanting to use
-+ /sys/class/rtc/rtc0.
-+
-+ - wake_delay:
-+
-+ This value determines the delay from the end of writing the image until the
-+ wake alarm is triggered. You can set an absolute time by writing the desired
-+ time into /sys/class/rtc/rtc0/wakealarm and leaving these values
-+ empty.
-+
-+ Note that for the wakeup to actually occur, you may need to modify entries
-+ in /proc/acpi/wakeup. This is done by echoing the name of the button in the
-+ first column (eg PBTN) into the file.
-+
-+7. How do you get support?
-+
-+ Glad you asked. TuxOnIce is being actively maintained and supported
-+ by Nigel (the guy doing most of the kernel coding at the moment), Bernard
-+ (who maintains the hibernate script and userspace user interface components)
-+ and its users.
-+
-+ Resources available include HowTos, FAQs and a Wiki, all available via
-+ tuxonice.net. You can find the mailing lists there.
-+
-+8. I think I've found a bug. What should I do?
-+
-+ By far and away, the most common problems people have with TuxOnIce
-+ are related to drivers not having adequate power management support. In this
-+ case, it is not a bug with TuxOnIce, but we can still help you. As we
-+ mentioned above, such issues can usually be worked around by building the
-+ functionality as modules and unloading them while hibernating. Please visit
-+ the Wiki for up-to-date lists of known issues and work arounds.
-+
-+ If this information doesn't help, try running:
-+
-+ hibernate --bug-report
-+
-+ ..and sending the output to the users mailing list.
-+
-+ Good information on how to provide us with useful information from an
-+ oops is found in the file REPORTING-BUGS, in the top level directory
-+ of the kernel tree. If you get an oops, please especially note the
-+ information about running what is printed on the screen through ksymoops.
-+ The raw information is useless.
-+
-+9. When will XXX be supported?
-+
-+ If there's a feature missing from TuxOnIce that you'd like, feel free to
-+ ask. We try to be obliging, within reason.
-+
-+ Patches are welcome. Please send to the list.
-+
-+10. How does it work?
-+
-+ TuxOnIce does its work in a number of steps.
-+
-+ a. Freezing system activity.
-+
-+ The first main stage in hibernating is to stop all other activity. This is
-+ achieved in stages. Processes are considered in four groups, which we will
-+ describe in reverse order for clarity's sake: Threads with the PF_NOFREEZE
-+ flag, kernel threads without this flag, userspace processes with the
-+ PF_SYNCTHREAD flag and all other processes. The first set (PF_NOFREEZE) are
-+ untouched by the refrigerator code. They are allowed to run during hibernating
-+ and resuming, and are used to support user interaction, storage access or the
-+ like. Other kernel threads (those unneeded while hibernating) are frozen last.
-+ This leaves us with userspace processes that need to be frozen. When a
-+ process enters one of the *_sync system calls, we set a PF_SYNCTHREAD flag on
-+ that process for the duration of that call. Processes that have this flag are
-+ frozen after processes without it, so that we can seek to ensure that dirty
-+ data is synced to disk as quickly as possible in a situation where other
-+ processes may be submitting writes at the same time. Freezing the processes
-+ that are submitting data stops new I/O from being submitted. Syncthreads can
-+ then cleanly finish their work.
So the order is:
-+
-+ - Userspace processes without PF_SYNCTHREAD or PF_NOFREEZE;
-+ - Userspace processes with PF_SYNCTHREAD (they won't have NOFREEZE);
-+ - Kernel processes without PF_NOFREEZE.
-+
-+ b. Eating memory.
-+
-+ For a successful hibernation cycle, you need to have enough disk space to
-+ store the image and enough memory for the various limitations of TuxOnIce's
-+ algorithm. You can also specify a maximum image size. In order to meet
-+ those constraints, TuxOnIce may 'eat' memory. If, after freezing
-+ processes, the constraints aren't met, TuxOnIce will thaw all the
-+ other processes and begin to eat memory until its calculations indicate
-+ the constraints are met. It will then freeze processes again and recheck
-+ its calculations.
-+
-+ c. Allocation of storage.
-+
-+ Next, TuxOnIce allocates the storage that will be used to save
-+ the image.
-+
-+ The core of TuxOnIce knows nothing about how or where pages are stored. We
-+ therefore request the active allocator (remember you might have compiled in
-+ more than one!) to allocate enough storage for our expected image size. If
-+ this request cannot be fulfilled, we eat more memory and try again. If it
-+ is fulfilled, we seek to allocate additional storage, just in case our
-+ expected compression ratio (if any) isn't achieved. This time, however, we
-+ just continue if we can't allocate enough storage.
-+
-+ If these calls to our allocator change the characteristics of the image
-+ such that we haven't allocated enough memory, we also loop. (The allocator
-+ may well need to allocate space for its storage information).
-+
-+ d. Write the first part of the image.
-+
-+ TuxOnIce stores the image in two sets of pages called 'pagesets'.
-+ Pageset 2 contains pages on the active and inactive lists; essentially
-+ the page cache. Pageset 1 contains all other pages, including the kernel.
-+
-+ We use two pagesets for one important reason: we need to make an atomic copy
-+ of the kernel to ensure consistency of the image. Without a second pageset,
-+ that would limit us to an image that was at most half the amount of memory
-+ available. Using two pagesets allows us to store a full image. Since pageset
-+ 2 pages won't be needed in saving pageset 1, we first save pageset 2 pages.
-+ We can then make our atomic copy of the remaining pages using both pageset 2
-+ pages and any other pages that are free. While saving both pagesets, we are
-+ careful not to corrupt the image. Among other things, we use low-level block
-+ I/O routines that don't change the pagecache contents.
-+
-+ The next step, then, is writing pageset 2.
-+
-+ e. Suspending drivers and storing processor context.
-+
-+ Having written pageset 2, TuxOnIce calls the power management functions to
-+ notify drivers of the hibernation, and saves the processor state in preparation
-+ for the atomic copy of memory we are about to make.
-+
-+ f. Atomic copy.
-+
-+ At this stage, everything else but the TuxOnIce code is halted. Processes
-+ are frozen or idling, drivers are quiesced and have stored (ideally and where
-+ necessary) their configuration in memory we are about to atomically copy.
-+ In our low-level architecture-specific code, we have saved the CPU state.
-+ We can therefore now do our atomic copy before resuming drivers etc.
-+
-+ g. Save the atomic copy (pageset 1).
-+
-+ TuxOnIce can then write the atomic copy of the remaining pages. Since we
-+ have copied the pages into other locations, we can continue to use the
-+ normal block I/O routines without fear of corrupting our image.
-+
-+ h. Save the image header.
-+
-+ Nearly there! We save our settings and other parameters needed for
-+ reloading pageset 1 in an 'image header'. We also tell our allocator to
-+ serialise its data at this stage, so that it can reread the image at resume
-+ time.
-+
-+ i. Set the image header.
-+
-+ Finally, we edit the header at our resume= location. The signature is
-+ changed by the allocator to reflect the fact that an image exists, and to
-+ point to the start of that data if necessary (swap allocator).
-+
-+ j. Power down.
-+
-+ Or reboot if we're debugging and the appropriate option is selected.
-+
-+ Whew!
-+
-+ Reloading the image.
-+ --------------------
-+
-+ Reloading the image is essentially the reverse of all the above. We load
-+ our copy of pageset 1, being careful to choose locations that aren't going
-+ to be overwritten as we copy it back (we start very early in the boot
-+ process, so there are no other processes to quiesce here). We then copy
-+ pageset 1 back to its original location in memory and restore the process
-+ context. We are now running with the original kernel. Next, we reload the
-+ pageset 2 pages, free the memory and swap used by TuxOnIce, restore
-+ the pageset header and restart processes. Sounds easy in comparison to
-+ hibernating, doesn't it?
-+
-+ There is of course more to TuxOnIce than this, but this explanation
-+ should be a good start. If there's interest, I'll write further
-+ documentation on range pages and the low-level I/O.
-+
-+11. Who wrote TuxOnIce?
-+
-+ (Answer based on the writings of Florent Chabaud, credits in files and
-+ Nigel's limited knowledge; apologies to anyone missed out!)
-+
-+ The main developers of TuxOnIce have been...
-+
-+ Gabor Kuti
-+ Pavel Machek
-+ Florent Chabaud
-+ Bernard Blackham
-+ Nigel Cunningham
-+
-+ Significant portions of swsusp, the code in the vanilla kernel which
-+ TuxOnIce enhances, have been worked on by Rafael Wysocki. Thanks should
-+ also be expressed to him.
-+
-+ The above-mentioned developers have been aided in their efforts by a host
-+ of hundreds, if not thousands, of testers and people who have submitted bug
-+ fixes & suggestions.
Of special note are the efforts of Michael Frank, who -+ had his computers repetitively hibernate and resume for literally tens of -+ thousands of cycles and developed scripts to stress the system and test -+ TuxOnIce far beyond the point most of us (Nigel included!) would consider -+ testing. His efforts have contributed as much to TuxOnIce as any of the -+ names above. -diff --git a/MAINTAINERS b/MAINTAINERS -index 2533fc4..e14223f 100644 ---- a/MAINTAINERS -+++ b/MAINTAINERS -@@ -5380,6 +5380,13 @@ S: Maintained - F: drivers/tc/ - F: include/linux/tc.h - -+TUXONICE (ENHANCED HIBERNATION) -+P: Nigel Cunningham -+M: nigel@tuxonice.net -+L: tuxonice-devel@tuxonice.net -+W: http://tuxonice.net -+S: Maintained -+ - U14-34F SCSI DRIVER - M: Dario Ballabio - L: linux-scsi@vger.kernel.org -diff --git a/arch/powerpc/mm/pgtable_32.c b/arch/powerpc/mm/pgtable_32.c -index 573b3bd..073736f 100644 ---- a/arch/powerpc/mm/pgtable_32.c -+++ b/arch/powerpc/mm/pgtable_32.c -@@ -422,6 +422,7 @@ void kernel_map_pages(struct page *page, int numpages, int enable) - - change_page_attr(page, numpages, enable ? 
PAGE_KERNEL : __pgprot(0)); - } -+EXPORT_SYMBOL_GPL(kernel_map_pages); - #endif /* CONFIG_DEBUG_PAGEALLOC */ - - static int fixmaps; -diff --git a/arch/x86/kernel/reboot.c b/arch/x86/kernel/reboot.c -index 704bddc..acdf978 100644 ---- a/arch/x86/kernel/reboot.c -+++ b/arch/x86/kernel/reboot.c -@@ -710,6 +710,7 @@ void machine_restart(char *cmd) - { - machine_ops.restart(cmd); - } -+EXPORT_SYMBOL_GPL(machine_restart); - - void machine_halt(void) - { -diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c -index 1d4eb93..035b423 100644 ---- a/arch/x86/mm/pageattr.c -+++ b/arch/x86/mm/pageattr.c -@@ -1296,6 +1296,7 @@ void kernel_map_pages(struct page *page, int numpages, int enable) - */ - __flush_tlb_all(); - } -+EXPORT_SYMBOL_GPL(kernel_map_pages); - - #ifdef CONFIG_HIBERNATION - -@@ -1310,7 +1311,7 @@ bool kernel_page_present(struct page *page) - pte = lookup_address((unsigned long)page_address(page), &level); - return (pte_val(*pte) & _PAGE_PRESENT); - } -- -+EXPORT_SYMBOL_GPL(kernel_page_present); - #endif /* CONFIG_HIBERNATION */ - - #endif /* CONFIG_DEBUG_PAGEALLOC */ -diff --git a/arch/x86/power/cpu.c b/arch/x86/power/cpu.c -index 0a979f3..7cdace5 100644 ---- a/arch/x86/power/cpu.c -+++ b/arch/x86/power/cpu.c -@@ -112,9 +112,7 @@ void save_processor_state(void) - { - __save_processor_state(&saved_context); - } --#ifdef CONFIG_X86_32 - EXPORT_SYMBOL(save_processor_state); --#endif - - static void do_fpu_end(void) - { -diff --git a/arch/x86/power/hibernate_32.c b/arch/x86/power/hibernate_32.c -index 81197c6..ff7e534 100644 ---- a/arch/x86/power/hibernate_32.c -+++ b/arch/x86/power/hibernate_32.c -@@ -8,6 +8,7 @@ - - #include - #include -+#include - - #include - #include -@@ -163,6 +164,7 @@ int swsusp_arch_resume(void) - restore_image(); - return 0; - } -+EXPORT_SYMBOL_GPL(swsusp_arch_resume); - - /* - * pfn_is_nosave - check if given pfn is in the 'nosave' section -diff --git a/arch/x86/power/hibernate_64.c b/arch/x86/power/hibernate_64.c -index 
65fdc86..e5c31f6 100644 ---- a/arch/x86/power/hibernate_64.c -+++ b/arch/x86/power/hibernate_64.c -@@ -10,6 +10,7 @@ - - #include - #include -+#include - #include - #include - #include -@@ -118,6 +119,7 @@ int swsusp_arch_resume(void) - restore_image(); - return 0; - } -+EXPORT_SYMBOL_GPL(swsusp_arch_resume); - - /* - * pfn_is_nosave - check if given pfn is in the 'nosave' section -@@ -168,3 +170,4 @@ int arch_hibernation_header_restore(void *addr) - restore_cr3 = rdr->cr3; - return (rdr->magic == RESTORE_MAGIC) ? 0 : -EINVAL; - } -+EXPORT_SYMBOL_GPL(arch_hibernation_header_restore); -diff --git a/block/Makefile b/block/Makefile -index cb2d515..f35a848 100644 ---- a/block/Makefile -+++ b/block/Makefile -@@ -5,7 +5,7 @@ - obj-$(CONFIG_BLOCK) := elevator.o blk-core.o blk-tag.o blk-sysfs.o \ - blk-barrier.o blk-settings.o blk-ioc.o blk-map.o \ - blk-exec.o blk-merge.o blk-softirq.o blk-timeout.o \ -- blk-iopoll.o ioctl.o genhd.o scsi_ioctl.o -+ blk-iopoll.o ioctl.o genhd.o scsi_ioctl.o uuid.o - - obj-$(CONFIG_BLK_DEV_BSG) += bsg.o - obj-$(CONFIG_BLK_CGROUP) += blk-cgroup.o -diff --git a/block/blk-core.c b/block/blk-core.c -index d1a9a0a..d229a5b 100644 ---- a/block/blk-core.c -+++ b/block/blk-core.c -@@ -37,6 +37,9 @@ EXPORT_TRACEPOINT_SYMBOL_GPL(block_remap); - EXPORT_TRACEPOINT_SYMBOL_GPL(block_rq_remap); - EXPORT_TRACEPOINT_SYMBOL_GPL(block_bio_complete); - -+int trap_non_toi_io; -+EXPORT_SYMBOL_GPL(trap_non_toi_io); -+ - static int __make_request(struct request_queue *q, struct bio *bio); - - /* -@@ -1555,6 +1558,9 @@ void submit_bio(int rw, struct bio *bio) - - bio->bi_rw |= rw; - -+ if (unlikely(trap_non_toi_io)) -+ BUG_ON(!bio_rw_flagged(bio, BIO_RW_TUXONICE)); -+ - /* - * If it's a regular read/write or a barrier with data attached, - * go through the normal accounting stuff before submission. 
-diff --git a/block/genhd.c b/block/genhd.c -index d13ba76..a69521c 100644 ---- a/block/genhd.c -+++ b/block/genhd.c -@@ -18,6 +18,8 @@ - #include - #include - #include -+#include -+#include - - #include "blk.h" - -@@ -1286,3 +1288,82 @@ int invalidate_partition(struct gendisk *disk, int partno) - } - - EXPORT_SYMBOL(invalidate_partition); -+ -+dev_t blk_lookup_uuid(const char *uuid) -+{ -+ dev_t devt = MKDEV(0, 0); -+ struct class_dev_iter iter; -+ struct device *dev; -+ -+ class_dev_iter_init(&iter, &block_class, NULL, &disk_type); -+ while (!devt && (dev = class_dev_iter_next(&iter))) { -+ struct gendisk *disk = dev_to_disk(dev); -+ struct disk_part_iter piter; -+ struct hd_struct *part; -+ -+ disk_part_iter_init(&piter, disk, DISK_PITER_INCL_PART0); -+ -+ while ((part = disk_part_iter_next(&piter))) { -+ if (part_matches_uuid(part, uuid)) { -+ devt = part_devt(part); -+ break; -+ } -+ } -+ disk_part_iter_exit(&piter); -+ } -+ class_dev_iter_exit(&iter); -+ return devt; -+} -+EXPORT_SYMBOL_GPL(blk_lookup_uuid); -+ -+/* Caller uses NULL, key to start. For each match found, we return a bdev on -+ * which we have done blkdev_get, and we do the blkdev_put on block devices -+ * that are passed to us. When no more matches are found, we return NULL. 
-+ */ -+struct block_device *next_bdev_of_type(struct block_device *last, -+ const char *key) -+{ -+ dev_t devt = MKDEV(0, 0); -+ struct class_dev_iter iter; -+ struct device *dev; -+ struct block_device *next = NULL, *bdev; -+ int got_last = 0; -+ -+ if (!key) -+ goto out; -+ -+ class_dev_iter_init(&iter, &block_class, NULL, &disk_type); -+ while (!devt && (dev = class_dev_iter_next(&iter))) { -+ struct gendisk *disk = dev_to_disk(dev); -+ struct disk_part_iter piter; -+ struct hd_struct *part; -+ -+ disk_part_iter_init(&piter, disk, DISK_PITER_INCL_PART0); -+ -+ while ((part = disk_part_iter_next(&piter))) { -+ bdev = bdget(part_devt(part)); -+ if (last && !got_last) { -+ if (last == bdev) -+ got_last = 1; -+ continue; -+ } -+ -+ if (blkdev_get(bdev, FMODE_READ)) -+ continue; -+ -+ if (bdev_matches_key(bdev, key)) { -+ next = bdev; -+ break; -+ } -+ -+ blkdev_put(bdev, FMODE_READ); -+ } -+ disk_part_iter_exit(&piter); -+ } -+ class_dev_iter_exit(&iter); -+out: -+ if (last) -+ blkdev_put(last, FMODE_READ); -+ return next; -+} -+EXPORT_SYMBOL_GPL(next_bdev_of_type); -diff --git a/block/uuid.c b/block/uuid.c -new file mode 100644 -index 0000000..3862685 ---- /dev/null -+++ b/block/uuid.c -@@ -0,0 +1,528 @@ -+#include -+#include -+#include -+ -+static int debug_enabled; -+ -+#define PRINTK(fmt, args...) do { \ -+ if (debug_enabled) \ -+ printk(KERN_DEBUG fmt, ## args); \ -+ } while(0) -+ -+#define PRINT_HEX_DUMP(v1, v2, v3, v4, v5, v6, v7, v8) \ -+ do { \ -+ if (debug_enabled) \ -+ print_hex_dump(v1, v2, v3, v4, v5, v6, v7, v8); \ -+ } while(0) -+ -+/* -+ * Simple UUID translation -+ */ -+ -+struct uuid_info { -+ const char *key; -+ const char *name; -+ long bkoff; -+ unsigned sboff; -+ unsigned sig_len; -+ const char *magic; -+ int uuid_offset; -+ int last_mount_offset; -+ int last_mount_size; -+}; -+ -+/* -+ * Based on libuuid's blkid_magic array. Note that I don't -+ * have uuid offsets for all of these yet - mssing ones are 0x0. -+ * Further information welcome. 
-+ * -+ * Rearranged by page of fs signature for optimisation. -+ */ -+static struct uuid_info uuid_list[] = { -+ { NULL, "oracleasm", 0, 32, 8, "ORCLDISK", 0x0, 0, 0 }, -+ { "ntfs", "ntfs", 0, 3, 8, "NTFS ", 0x0, 0, 0 }, -+ { "vfat", "vfat", 0, 0x52, 5, "MSWIN", 0x0, 0, 0 }, -+ { "vfat", "vfat", 0, 0x52, 8, "FAT32 ", 0x0, 0, 0 }, -+ { "vfat", "vfat", 0, 0x36, 5, "MSDOS", 0x0, 0, 0 }, -+ { "vfat", "vfat", 0, 0x36, 8, "FAT16 ", 0x0, 0, 0 }, -+ { "vfat", "vfat", 0, 0x36, 8, "FAT12 ", 0x0, 0, 0 }, -+ { "vfat", "vfat", 0, 0, 1, "\353", 0x0, 0, 0 }, -+ { "vfat", "vfat", 0, 0, 1, "\351", 0x0, 0, 0 }, -+ { "vfat", "vfat", 0, 0x1fe, 2, "\125\252", 0x0, 0, 0 }, -+ { "xfs", "xfs", 0, 0, 4, "XFSB", 0x14, 0, 0 }, -+ { "romfs", "romfs", 0, 0, 8, "-rom1fs-", 0x0, 0, 0 }, -+ { "bfs", "bfs", 0, 0, 4, "\316\372\173\033", 0, 0, 0 }, -+ { "cramfs", "cramfs", 0, 0, 4, "E=\315\050", 0x0, 0, 0 }, -+ { "qnx4", "qnx4", 0, 4, 6, "QNX4FS", 0, 0, 0 }, -+ { NULL, "crypt_LUKS", 0, 0, 6, "LUKS\xba\xbe", 0x0, 0, 0 }, -+ { "squashfs", "squashfs", 0, 0, 4, "sqsh", 0, 0, 0 }, -+ { "squashfs", "squashfs", 0, 0, 4, "hsqs", 0, 0, 0 }, -+ { "ocfs", "ocfs", 0, 8, 9, "OracleCFS", 0x0, 0, 0 }, -+ { "lvm2pv", "lvm2pv", 0, 0x018, 8, "LVM2 001", 0x0, 0, 0 }, -+ { "sysv", "sysv", 0, 0x3f8, 4, "\020~\030\375", 0, 0, 0 }, -+ { "ext", "ext", 1, 0x38, 2, "\123\357", 0x468, 0x42c, 4 }, -+ { "minix", "minix", 1, 0x10, 2, "\177\023", 0, 0, 0 }, -+ { "minix", "minix", 1, 0x10, 2, "\217\023", 0, 0, 0 }, -+ { "minix", "minix", 1, 0x10, 2, "\150\044", 0, 0, 0 }, -+ { "minix", "minix", 1, 0x10, 2, "\170\044", 0, 0, 0 }, -+ { "lvm2pv", "lvm2pv", 1, 0x018, 8, "LVM2 001", 0x0, 0, 0 }, -+ { "vxfs", "vxfs", 1, 0, 4, "\365\374\001\245", 0, 0, 0 }, -+ { "hfsplus", "hfsplus", 1, 0, 2, "BD", 0x0, 0, 0 }, -+ { "hfsplus", "hfsplus", 1, 0, 2, "H+", 0x0, 0, 0 }, -+ { "hfsplus", "hfsplus", 1, 0, 2, "HX", 0x0, 0, 0 }, -+ { "hfs", "hfs", 1, 0, 2, "BD", 0x0, 0, 0 }, -+ { "ocfs2", "ocfs2", 1, 0, 6, "OCFSV2", 0x0, 0, 0 }, -+ { "lvm2pv", 
"lvm2pv", 0, 0x218, 8, "LVM2 001", 0x0, 0, 0 }, -+ { "lvm2pv", "lvm2pv", 1, 0x218, 8, "LVM2 001", 0x0, 0, 0 }, -+ { "ocfs2", "ocfs2", 2, 0, 6, "OCFSV2", 0x0, 0, 0 }, -+ { "swap", "swap", 0, 0xff6, 10, "SWAP-SPACE", 0x40c, 0, 0 }, -+ { "swap", "swap", 0, 0xff6, 10, "SWAPSPACE2", 0x40c, 0, 0 }, -+ { "swap", "swsuspend", 0, 0xff6, 9, "S1SUSPEND", 0x40c, 0, 0 }, -+ { "swap", "swsuspend", 0, 0xff6, 9, "S2SUSPEND", 0x40c, 0, 0 }, -+ { "swap", "swsuspend", 0, 0xff6, 9, "ULSUSPEND", 0x40c, 0, 0 }, -+ { "ocfs2", "ocfs2", 4, 0, 6, "OCFSV2", 0x0, 0, 0 }, -+ { "ocfs2", "ocfs2", 8, 0, 6, "OCFSV2", 0x0, 0, 0 }, -+ { "hpfs", "hpfs", 8, 0, 4, "I\350\225\371", 0, 0, 0 }, -+ { "reiserfs", "reiserfs", 8, 0x34, 8, "ReIsErFs", 0x10054, 0, 0 }, -+ { "reiserfs", "reiserfs", 8, 20, 8, "ReIsErFs", 0x10054, 0, 0 }, -+ { "zfs", "zfs", 8, 0, 8, "\0\0\x02\xf5\xb0\x07\xb1\x0c", 0x0, 0, 0 }, -+ { "zfs", "zfs", 8, 0, 8, "\x0c\xb1\x07\xb0\xf5\x02\0\0", 0x0, 0, 0 }, -+ { "ufs", "ufs", 8, 0x55c, 4, "T\031\001\000", 0, 0, 0 }, -+ { "swap", "swap", 0, 0x1ff6, 10, "SWAP-SPACE", 0x40c, 0, 0 }, -+ { "swap", "swap", 0, 0x1ff6, 10, "SWAPSPACE2", 0x40c, 0, 0 }, -+ { "swap", "swsuspend", 0, 0x1ff6, 9, "S1SUSPEND", 0x40c, 0, 0 }, -+ { "swap", "swsuspend", 0, 0x1ff6, 9, "S2SUSPEND", 0x40c, 0, 0 }, -+ { "swap", "swsuspend", 0, 0x1ff6, 9, "ULSUSPEND", 0x40c, 0, 0 }, -+ { "reiserfs", "reiserfs", 64, 0x34, 9, "ReIsEr2Fs", 0x10054, 0, 0 }, -+ { "reiserfs", "reiserfs", 64, 0x34, 9, "ReIsEr3Fs", 0x10054, 0, 0 }, -+ { "reiserfs", "reiserfs", 64, 0x34, 8, "ReIsErFs", 0x10054, 0, 0 }, -+ { "reiser4", "reiser4", 64, 0, 7, "ReIsEr4", 0x100544, 0, 0 }, -+ { "gfs2", "gfs2", 64, 0, 4, "\x01\x16\x19\x70", 0x0, 0, 0 }, -+ { "gfs", "gfs", 64, 0, 4, "\x01\x16\x19\x70", 0x0, 0, 0 }, -+ { "btrfs", "btrfs", 64, 0x40, 8, "_BHRfS_M", 0x0, 0, 0 }, -+ { "swap", "swap", 0, 0x3ff6, 10, "SWAP-SPACE", 0x40c, 0, 0 }, -+ { "swap", "swap", 0, 0x3ff6, 10, "SWAPSPACE2", 0x40c, 0, 0 }, -+ { "swap", "swsuspend", 0, 0x3ff6, 9, "S1SUSPEND", 0x40c, 
0, 0 }, -+ { "swap", "swsuspend", 0, 0x3ff6, 9, "S2SUSPEND", 0x40c, 0, 0 }, -+ { "swap", "swsuspend", 0, 0x3ff6, 9, "ULSUSPEND", 0x40c, 0, 0 }, -+ { "udf", "udf", 32, 1, 5, "BEA01", 0x0, 0, 0 }, -+ { "udf", "udf", 32, 1, 5, "BOOT2", 0x0, 0, 0 }, -+ { "udf", "udf", 32, 1, 5, "CD001", 0x0, 0, 0 }, -+ { "udf", "udf", 32, 1, 5, "CDW02", 0x0, 0, 0 }, -+ { "udf", "udf", 32, 1, 5, "NSR02", 0x0, 0, 0 }, -+ { "udf", "udf", 32, 1, 5, "NSR03", 0x0, 0, 0 }, -+ { "udf", "udf", 32, 1, 5, "TEA01", 0x0, 0, 0 }, -+ { "iso9660", "iso9660", 32, 1, 5, "CD001", 0x0, 0, 0 }, -+ { "iso9660", "iso9660", 32, 9, 5, "CDROM", 0x0, 0, 0 }, -+ { "jfs", "jfs", 32, 0, 4, "JFS1", 0x88, 0, 0 }, -+ { "swap", "swap", 0, 0x7ff6, 10, "SWAP-SPACE", 0x40c, 0, 0 }, -+ { "swap", "swap", 0, 0x7ff6, 10, "SWAPSPACE2", 0x40c, 0, 0 }, -+ { "swap", "swsuspend", 0, 0x7ff6, 9, "S1SUSPEND", 0x40c, 0, 0 }, -+ { "swap", "swsuspend", 0, 0x7ff6, 9, "S2SUSPEND", 0x40c, 0, 0 }, -+ { "swap", "swsuspend", 0, 0x7ff6, 9, "ULSUSPEND", 0x40c, 0, 0 }, -+ { "swap", "swap", 0, 0xfff6, 10, "SWAP-SPACE", 0x40c, 0, 0 }, -+ { "swap", "swap", 0, 0xfff6, 10, "SWAPSPACE2", 0x40c, 0, 0 }, -+ { "swap", "swsuspend", 0, 0xfff6, 9, "S1SUSPEND", 0x40c, 0, 0 }, -+ { "swap", "swsuspend", 0, 0xfff6, 9, "S2SUSPEND", 0x40c, 0, 0 }, -+ { "swap", "swsuspend", 0, 0xfff6, 9, "ULSUSPEND", 0x40c, 0, 0 }, -+ { "zfs", "zfs", 264, 0, 8, "\0\0\x02\xf5\xb0\x07\xb1\x0c", 0x0, 0, 0 }, -+ { "zfs", "zfs", 264, 0, 8, "\x0c\xb1\x07\xb0\xf5\x02\0\0", 0x0, 0, 0 }, -+ { NULL, NULL, 0, 0, 0, NULL, 0x0, 0, 0 } -+}; -+ -+static int null_uuid(const char *uuid) -+{ -+ int i; -+ -+ for (i = 0; i < 16 && !uuid[i]; i++); -+ -+ return (i == 16); -+} -+ -+ -+static void uuid_end_bio(struct bio *bio, int err) -+{ -+ struct page *page = bio->bi_io_vec[0].bv_page; -+ -+ BUG_ON(!test_bit(BIO_UPTODATE, &bio->bi_flags)); -+ -+ unlock_page(page); -+ bio_put(bio); -+} -+ -+ -+/** -+ * submit - submit BIO request -+ * @dev: The block device we're using. 
-+ * @page_num: The page we're reading. -+ * -+ * Based on Patrick Mochell's pmdisk code from long ago: "Straight from the -+ * textbook - allocate and initialize the bio. If we're writing, make sure -+ * the page is marked as dirty. Then submit it and carry on." -+ **/ -+static struct page *read_bdev_page(struct block_device *dev, int page_num) -+{ -+ struct bio *bio = NULL; -+ struct page *page = alloc_page(GFP_NOFS); -+ -+ if (!page) { -+ printk(KERN_ERR "Failed to allocate a page for reading data " -+ "in UUID checks."); -+ return NULL; -+ } -+ -+ bio = bio_alloc(GFP_NOFS, 1); -+ bio->bi_bdev = dev; -+ bio->bi_sector = page_num << 3; -+ bio->bi_end_io = uuid_end_bio; -+ -+ PRINTK("Submitting bio on device %lx, page %d.\n", -+ (unsigned long) dev->bd_dev, page_num); -+ -+ if (bio_add_page(bio, page, PAGE_SIZE, 0) < PAGE_SIZE) { -+ printk(KERN_DEBUG "ERROR: adding page to bio at %d\n", -+ page_num); -+ bio_put(bio); -+ __free_page(page); -+ printk(KERN_DEBUG "read_bdev_page freed page %p (in error " -+ "path).\n", page); -+ return ERR_PTR(-EFAULT); -+ } -+ -+ lock_page(page); -+ submit_bio(READ | (1 << BIO_RW_SYNCIO) | -+ (1 << BIO_RW_UNPLUG), bio); -+ -+ wait_on_page_locked(page); -+ return page; -+} -+ -+int bdev_matches_key(struct block_device *bdev, const char *key) -+{ -+ unsigned char *data = NULL; -+ struct page *data_page = NULL; -+ -+ int dev_offset, pg_num, pg_off, i; -+ int last_pg_num = -1; -+ int result = 0; -+ char buf[50]; -+ -+ if (null_uuid(key)) { -+ PRINTK("Refusing to find a NULL key.\n"); -+ return 0; -+ } -+ -+ if (!bdev->bd_disk) { -+ bdevname(bdev, buf); -+ PRINTK("bdev %s has no bd_disk.\n", buf); -+ return 0; -+ } -+ -+ if (!bdev->bd_disk->queue) { -+ bdevname(bdev, buf); -+ PRINTK("bdev %s has no queue.\n", buf); -+ return 0; -+ } -+ -+ for (i = 0; uuid_list[i].name; i++) { -+ struct uuid_info *dat = &uuid_list[i]; -+ -+ if (!dat->key || strcmp(dat->key, key)) -+ continue; -+ -+ dev_offset = (dat->bkoff << 10) + dat->sboff; -+ pg_num = 
dev_offset >> 12; -+ pg_off = dev_offset & 0xfff; -+ -+ if ((((pg_num + 1) << 3) - 1) > bdev->bd_part->nr_sects >> 1) -+ continue; -+ -+ if (pg_num != last_pg_num) { -+ if (data_page) -+ __free_page(data_page); -+ data_page = read_bdev_page(bdev, pg_num); -+ if (!data_page) { -+ result = -ENOMEM; -+ break; -+ } -+ data = page_address(data_page); -+ } -+ -+ last_pg_num = pg_num; -+ -+ if (strncmp(&data[pg_off], dat->magic, dat->sig_len)) -+ continue; -+ -+ result = 1; -+ break; -+ } -+ -+ if (data_page) -+ __free_page(data_page); -+ -+ return result; -+} -+ -+int part_matches_uuid(struct hd_struct *part, const char *uuid) -+{ -+ struct block_device *bdev; -+ unsigned char *data = NULL; -+ struct page *data_page = NULL; -+ -+ int dev_offset, pg_num, pg_off; -+ int uuid_pg_num, uuid_pg_off, i; -+ unsigned char *uuid_data = NULL; -+ struct page *uuid_data_page = NULL; -+ -+ int last_pg_num = -1, last_uuid_pg_num = 0; -+ int result = 0; -+ char buf[50]; -+ -+ if (null_uuid(uuid)) { -+ PRINTK("Refusing to find a NULL uuid.\n"); -+ return 0; -+ } -+ -+ bdev = bdget(part_devt(part)); -+ -+ PRINTK("blkdev_get %p.\n", part); -+ -+ if (blkdev_get(bdev, FMODE_READ)) { -+ PRINTK("blkdev_get failed.\n"); -+ return 0; -+ } -+ -+ if (!bdev->bd_disk) { -+ bdevname(bdev, buf); -+ PRINTK("bdev %s has no bd_disk.\n", buf); -+ goto out; -+ } -+ -+ if (!bdev->bd_disk->queue) { -+ bdevname(bdev, buf); -+ PRINTK("bdev %s has no queue.\n", buf); -+ goto out; -+ } -+ -+ PRINT_HEX_DUMP(KERN_EMERG, "part_matches_uuid looking for ", -+ DUMP_PREFIX_NONE, 16, 1, uuid, 16, 0); -+ -+ for (i = 0; uuid_list[i].name; i++) { -+ struct uuid_info *dat = &uuid_list[i]; -+ dev_offset = (dat->bkoff << 10) + dat->sboff; -+ pg_num = dev_offset >> 12; -+ pg_off = dev_offset & 0xfff; -+ uuid_pg_num = dat->uuid_offset >> 12; -+ uuid_pg_off = dat->uuid_offset & 0xfff; -+ -+ if ((((pg_num + 1) << 3) - 1) > part->nr_sects >> 1) -+ continue; -+ -+ /* Ignore partition types with no UUID offset */ -+ if 
(!dat->uuid_offset) -+ continue; -+ -+ if (pg_num != last_pg_num) { -+ if (data_page) -+ __free_page(data_page); -+ data_page = read_bdev_page(bdev, pg_num); -+ if (!data_page) { -+ result = -ENOMEM; -+ break; -+ } -+ data = page_address(data_page); -+ } -+ -+ last_pg_num = pg_num; -+ -+ if (strncmp(&data[pg_off], dat->magic, dat->sig_len)) -+ continue; -+ -+ /* Does the UUID match? */ -+ if (uuid_pg_num > part->nr_sects >> 3) -+ continue; -+ -+ if (!uuid_data || uuid_pg_num != last_uuid_pg_num) { -+ if (uuid_data_page) -+ __free_page(uuid_data_page); -+ uuid_data_page = read_bdev_page(bdev, uuid_pg_num); -+ if (!uuid_data_page) { -+ result = -ENOMEM; -+ break; -+ } -+ uuid_data = page_address(uuid_data_page); -+ } -+ -+ last_uuid_pg_num = uuid_pg_num; -+ -+ PRINT_HEX_DUMP(KERN_EMERG, "part_matches_uuid considering ", -+ DUMP_PREFIX_NONE, 16, 1, -+ &uuid_data[uuid_pg_off], 16, 0); -+ -+ if (!memcmp(&uuid_data[uuid_pg_off], uuid, 16)) { -+ PRINTK("We have a match.\n"); -+ result = 1; -+ break; -+ } -+ } -+ -+ if (data_page) -+ __free_page(data_page); -+ -+ if (uuid_data_page) -+ __free_page(uuid_data_page); -+ -+out: -+ blkdev_put(bdev, FMODE_READ); -+ return result; -+} -+ -+void free_fs_info(struct fs_info *fs_info) -+{ -+ if (!fs_info || IS_ERR(fs_info)) -+ return; -+ -+ if (fs_info->last_mount) -+ kfree(fs_info->last_mount); -+ -+ kfree(fs_info); -+} -+EXPORT_SYMBOL_GPL(free_fs_info); -+ -+struct fs_info *fs_info_from_block_dev(struct block_device *bdev) -+{ -+ unsigned char *data = NULL; -+ struct page *data_page = NULL; -+ -+ int dev_offset, pg_num, pg_off; -+ int uuid_pg_num, uuid_pg_off, i; -+ unsigned char *uuid_data = NULL; -+ struct page *uuid_data_page = NULL; -+ -+ int last_pg_num = -1, last_uuid_pg_num = 0; -+ char buf[50]; -+ struct fs_info *fs_info = NULL; -+ -+ bdevname(bdev, buf); -+ -+ PRINTK(KERN_EMERG "uuid_from_block_dev looking for partition type " -+ "of %s.\n", buf); -+ -+ for (i = 0; uuid_list[i].name; i++) { -+ struct uuid_info *dat = 
&uuid_list[i]; -+ dev_offset = (dat->bkoff << 10) + dat->sboff; -+ pg_num = dev_offset >> 12; -+ pg_off = dev_offset & 0xfff; -+ uuid_pg_num = dat->uuid_offset >> 12; -+ uuid_pg_off = dat->uuid_offset & 0xfff; -+ -+ if ((((pg_num + 1) << 3) - 1) > bdev->bd_part->nr_sects >> 1) -+ continue; -+ -+ /* Ignore partition types with no UUID offset */ -+ if (!dat->uuid_offset) -+ continue; -+ -+ if (pg_num != last_pg_num) { -+ if (data_page) -+ __free_page(data_page); -+ data_page = read_bdev_page(bdev, pg_num); -+ if (!data_page) { -+ fs_info = ERR_PTR(-ENOMEM); -+ break; -+ } -+ data = page_address(data_page); -+ } -+ -+ last_pg_num = pg_num; -+ -+ if (strncmp(&data[pg_off], dat->magic, dat->sig_len)) -+ continue; -+ -+ PRINTK("This partition looks like %s.\n", dat->name); -+ -+ fs_info = kzalloc(sizeof(struct fs_info), GFP_KERNEL); -+ -+ if (!fs_info) { -+ PRINTK("Failed to allocate fs_info struct."); -+ fs_info = ERR_PTR(-ENOMEM); -+ break; -+ } -+ -+ /* UUID can't be off the end of the disk */ -+ if ((uuid_pg_num > bdev->bd_part->nr_sects >> 3) || -+ !dat->uuid_offset) -+ goto no_uuid; -+ -+ if (!uuid_data || uuid_pg_num != last_uuid_pg_num) { -+ if (uuid_data_page) -+ __free_page(uuid_data_page); -+ uuid_data_page = read_bdev_page(bdev, uuid_pg_num); -+ if (!uuid_data_page) { -+ fs_info = ERR_PTR(-ENOMEM); -+ break; -+ } -+ uuid_data = page_address(uuid_data_page); -+ } -+ -+ last_uuid_pg_num = uuid_pg_num; -+ memcpy(&fs_info->uuid, &uuid_data[uuid_pg_off], 16); -+ -+no_uuid: -+ PRINT_HEX_DUMP(KERN_EMERG, "fs_info_from_block_dev " -+ "returning uuid ", DUMP_PREFIX_NONE, 16, 1, -+ fs_info->uuid, 16, 0); -+ -+ if (dat->last_mount_size) { -+ int pg = dat->last_mount_offset >> 12, sz; -+ int off = dat->last_mount_offset & 0xfff; -+ struct page *last_mount = read_bdev_page(bdev, pg); -+ unsigned char *last_mount_data; -+ char *ptr; -+ -+ if (!last_mount) { -+ fs_info = ERR_PTR(-ENOMEM); -+ break; -+ } -+ last_mount_data = page_address(last_mount); -+ sz = 
dat->last_mount_size; -+ ptr = kmalloc(sz, GFP_KERNEL); -+ -+ if (!ptr) { -+ printk(KERN_EMERG "fs_info_from_block_dev " -+ "failed to get memory for last mount " -+ "timestamp."); -+ free_fs_info(fs_info); -+ fs_info = ERR_PTR(-ENOMEM); -+ } else { -+ fs_info->last_mount = ptr; -+ fs_info->last_mount_size = sz; -+ memcpy(ptr, &last_mount_data[off], sz); -+ } -+ -+ __free_page(last_mount); -+ } -+ break; -+ } -+ -+ if (data_page) -+ __free_page(data_page); -+ -+ if (uuid_data_page) -+ __free_page(uuid_data_page); -+ -+ return fs_info; -+} -+EXPORT_SYMBOL_GPL(fs_info_from_block_dev); -+ -+static int __init uuid_debug_setup(char *str) -+{ -+ int value; -+ -+ if (sscanf(str, "=%d", &value)) -+ debug_enabled = value; -+ -+ return 1; -+} -+ -+__setup("uuid_debug", uuid_debug_setup); -diff --git a/crypto/Kconfig b/crypto/Kconfig -index 81c185a..94cb5e8 100644 ---- a/crypto/Kconfig -+++ b/crypto/Kconfig -@@ -806,6 +806,13 @@ config CRYPTO_LZO - help - This is the LZO algorithm. - -+config CRYPTO_LZF -+ tristate "LZF compression algorithm" -+ select CRYPTO_ALGAPI -+ help -+ This is the LZF algorithm. It is especially useful for TuxOnIce, -+ because it achieves good compression quickly. -+ - comment "Random Number Generation" - - config CRYPTO_ANSI_CPRNG -diff --git a/crypto/Makefile b/crypto/Makefile -index 9e8f619..a06b213 100644 ---- a/crypto/Makefile -+++ b/crypto/Makefile -@@ -77,6 +77,7 @@ obj-$(CONFIG_CRYPTO_DEFLATE) += deflate.o - obj-$(CONFIG_CRYPTO_ZLIB) += zlib.o - obj-$(CONFIG_CRYPTO_MICHAEL_MIC) += michael_mic.o - obj-$(CONFIG_CRYPTO_CRC32C) += crc32c.o -+obj-$(CONFIG_CRYPTO_LZF) += lzf.o - obj-$(CONFIG_CRYPTO_AUTHENC) += authenc.o - obj-$(CONFIG_CRYPTO_LZO) += lzo.o - obj-$(CONFIG_CRYPTO_RNG2) += rng.o -diff --git a/crypto/lzf.c b/crypto/lzf.c -new file mode 100644 -index 0000000..ccaf83a ---- /dev/null -+++ b/crypto/lzf.c -@@ -0,0 +1,326 @@ -+/* -+ * Cryptoapi LZF compression module. 
-+ * -+ * Copyright (c) 2004-2008 Nigel Cunningham -+ * -+ * based on the deflate.c file: -+ * -+ * Copyright (c) 2003 James Morris -+ * -+ * and upon the LZF compression module donated to the TuxOnIce project with -+ * the following copyright: -+ * -+ * This program is free software; you can redistribute it and/or modify it -+ * under the terms of the GNU General Public License as published by the Free -+ * Software Foundation; either version 2 of the License, or (at your option) -+ * any later version. -+ * Copyright (c) 2000-2003 Marc Alexander Lehmann -+ * -+ * Redistribution and use in source and binary forms, with or without modifica- -+ * tion, are permitted provided that the following conditions are met: -+ * -+ * 1. Redistributions of source code must retain the above copyright notice, -+ * this list of conditions and the following disclaimer. -+ * -+ * 2. Redistributions in binary form must reproduce the above copyright -+ * notice, this list of conditions and the following disclaimer in the -+ * documentation and/or other materials provided with the distribution. -+ * -+ * 3. The name of the author may not be used to endorse or promote products -+ * derived from this software without specific prior written permission. -+ * -+ * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR IMPLIED -+ * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MER- -+ * CHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO -+ * EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPE- -+ * CIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, -+ * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; -+ * OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, -+ * WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTH- -+ * ERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED -+ * OF THE POSSIBILITY OF SUCH DAMAGE. -+ * -+ * Alternatively, the contents of this file may be used under the terms of -+ * the GNU General Public License version 2 (the "GPL"), in which case the -+ * provisions of the GPL are applicable instead of the above. If you wish to -+ * allow the use of your version of this file only under the terms of the -+ * GPL and not to allow others to use your version of this file under the -+ * BSD license, indicate your decision by deleting the provisions above and -+ * replace them with the notice and other provisions required by the GPL. If -+ * you do not delete the provisions above, a recipient may use your version -+ * of this file under either the BSD or the GPL. -+ */ -+ -+#include -+#include -+#include -+#include -+#include -+#include -+#include -+#include -+ -+struct lzf_ctx { -+ void *hbuf; -+ unsigned int bufofs; -+}; -+ -+/* -+ * size of hashtable is (1 << hlog) * sizeof (char *) -+ * decompression is independent of the hash table size -+ * the difference between 15 and 14 is very small -+ * for small blocks (and 14 is also faster). -+ * For a low-memory configuration, use hlog == 13; -+ * For best compression, use 15 or 16. -+ */ -+static const int hlog = 13; -+ -+/* -+ * don't play with this unless you benchmark! 
-+ * decompression is not dependent on the hash function -+ * the hashing function might seem strange, just believe me -+ * it works ;) -+ */ -+static inline u16 first(const u8 *p) -+{ -+ return ((p[0]) << 8) + p[1]; -+} -+ -+static inline u16 next(u8 v, const u8 *p) -+{ -+ return ((v) << 8) + p[2]; -+} -+ -+static inline u32 idx(unsigned int h) -+{ -+ return (((h ^ (h << 5)) >> (3*8 - hlog)) + h*3) & ((1 << hlog) - 1); -+} -+ -+/* -+ * IDX works because it is very similar to a multiplicative hash, e.g. -+ * (h * 57321 >> (3*8 - hlog)) -+ * the next one is also quite good, albeit slow ;) -+ * (int)(cos(h & 0xffffff) * 1e6) -+ */ -+ -+static const int max_lit = (1 << 5); -+static const int max_off = (1 << 13); -+static const int max_ref = ((1 << 8) + (1 << 3)); -+ -+/* -+ * compressed format -+ * -+ * 000LLLLL ; literal -+ * LLLOOOOO oooooooo ; backref L -+ * 111OOOOO LLLLLLLL oooooooo ; backref L+7 -+ * -+ */ -+ -+static void lzf_compress_exit(struct crypto_tfm *tfm) -+{ -+ struct lzf_ctx *ctx = crypto_tfm_ctx(tfm); -+ -+ if (!ctx->hbuf) -+ return; -+ -+ vfree(ctx->hbuf); -+ ctx->hbuf = NULL; -+} -+ -+static int lzf_compress_init(struct crypto_tfm *tfm) -+{ -+ struct lzf_ctx *ctx = crypto_tfm_ctx(tfm); -+ -+ /* Get LZF ready to go */ -+ ctx->hbuf = vmalloc_32((1 << hlog) * sizeof(char *)); -+ if (ctx->hbuf) -+ return 0; -+ -+ printk(KERN_WARNING "Failed to allocate %ld bytes for lzf workspace\n", -+ (long) ((1 << hlog) * sizeof(char *))); -+ return -ENOMEM; -+} -+ -+static int lzf_compress(struct crypto_tfm *tfm, const u8 *in_data, -+ unsigned int in_len, u8 *out_data, unsigned int *out_len) -+{ -+ struct lzf_ctx *ctx = crypto_tfm_ctx(tfm); -+ const u8 **htab = ctx->hbuf; -+ const u8 **hslot; -+ const u8 *ip = in_data; -+ u8 *op = out_data; -+ const u8 *in_end = ip + in_len; -+ u8 *out_end = op + *out_len - 3; -+ const u8 *ref; -+ -+ unsigned int hval = first(ip); -+ unsigned long off; -+ int lit = 0; -+ -+ memset(htab, 0, sizeof(htab)); -+ -+ for (;;) { -+ if (ip 
< in_end - 2) { -+ hval = next(hval, ip); -+ hslot = htab + idx(hval); -+ ref = *hslot; -+ *hslot = ip; -+ -+ off = ip - ref - 1; -+ if (off < max_off -+ && ip + 4 < in_end && ref > in_data -+ && *(u16 *) ref == *(u16 *) ip && ref[2] == ip[2] -+ ) { -+ /* match found at *ref++ */ -+ unsigned int len = 2; -+ unsigned int maxlen = in_end - ip - len; -+ maxlen = maxlen > max_ref ? max_ref : maxlen; -+ -+ do { -+ len++; -+ } while (len < maxlen && ref[len] == ip[len]); -+ -+ if (op + lit + 1 + 3 >= out_end) { -+ *out_len = PAGE_SIZE; -+ return 0; -+ } -+ -+ if (lit) { -+ *op++ = lit - 1; -+ lit = -lit; -+ do { -+ *op++ = ip[lit]; -+ } while (++lit); -+ } -+ -+ len -= 2; -+ ip++; -+ -+ if (len < 7) { -+ *op++ = (off >> 8) + (len << 5); -+ } else { -+ *op++ = (off >> 8) + (7 << 5); -+ *op++ = len - 7; -+ } -+ -+ *op++ = off; -+ -+ ip += len; -+ hval = first(ip); -+ hval = next(hval, ip); -+ htab[idx(hval)] = ip; -+ ip++; -+ continue; -+ } -+ } else if (ip == in_end) -+ break; -+ -+ /* one more literal byte we must copy */ -+ lit++; -+ ip++; -+ -+ if (lit == max_lit) { -+ if (op + 1 + max_lit >= out_end) { -+ *out_len = PAGE_SIZE; -+ return 0; -+ } -+ -+ *op++ = max_lit - 1; -+ memcpy(op, ip - max_lit, max_lit); -+ op += max_lit; -+ lit = 0; -+ } -+ } -+ -+ if (lit) { -+ if (op + lit + 1 >= out_end) { -+ *out_len = PAGE_SIZE; -+ return 0; -+ } -+ -+ *op++ = lit - 1; -+ lit = -lit; -+ do { -+ *op++ = ip[lit]; -+ } while (++lit); -+ } -+ -+ *out_len = op - out_data; -+ return 0; -+} -+ -+static int lzf_decompress(struct crypto_tfm *tfm, const u8 *src, -+ unsigned int slen, u8 *dst, unsigned int *dlen) -+{ -+ u8 const *ip = src; -+ u8 *op = dst; -+ u8 const *const in_end = ip + slen; -+ u8 *const out_end = op + *dlen; -+ -+ *dlen = PAGE_SIZE; -+ do { -+ unsigned int ctrl = *ip++; -+ -+ if (ctrl < (1 << 5)) { -+ /* literal run */ -+ ctrl++; -+ -+ if (op + ctrl > out_end) -+ return 0; -+ memcpy(op, ip, ctrl); -+ op += ctrl; -+ ip += ctrl; -+ } else { /* back reference */ -+ -+ 
unsigned int len = ctrl >> 5; -+ -+ u8 *ref = op - ((ctrl & 0x1f) << 8) - 1; -+ -+ if (len == 7) -+ len += *ip++; -+ -+ ref -= *ip++; -+ len += 2; -+ -+ if (op + len > out_end || ref < (u8 *) dst) -+ return 0; -+ -+ do { -+ *op++ = *ref++; -+ } while (--len); -+ } -+ } while (op < out_end && ip < in_end); -+ -+ *dlen = op - (u8 *) dst; -+ return 0; -+} -+ -+static struct crypto_alg alg = { -+ .cra_name = "lzf", -+ .cra_flags = CRYPTO_ALG_TYPE_COMPRESS, -+ .cra_ctxsize = sizeof(struct lzf_ctx), -+ .cra_module = THIS_MODULE, -+ .cra_list = LIST_HEAD_INIT(alg.cra_list), -+ .cra_init = lzf_compress_init, -+ .cra_exit = lzf_compress_exit, -+ .cra_u = { .compress = { -+ .coa_compress = lzf_compress, -+ .coa_decompress = lzf_decompress } } -+}; -+ -+static int __init init(void) -+{ -+ return crypto_register_alg(&alg); -+} -+ -+static void __exit fini(void) -+{ -+ crypto_unregister_alg(&alg); -+} -+ -+module_init(init); -+module_exit(fini); -+ -+MODULE_LICENSE("GPL"); -+MODULE_DESCRIPTION("LZF Compression Algorithm"); -+MODULE_AUTHOR("Marc Alexander Lehmann & Nigel Cunningham"); -diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c -index a5142bd..3fed8b2 100644 ---- a/drivers/base/power/main.c -+++ b/drivers/base/power/main.c -@@ -66,6 +66,7 @@ void device_pm_lock(void) - { - mutex_lock(&dpm_list_mtx); - } -+EXPORT_SYMBOL_GPL(device_pm_lock); - - /** - * device_pm_unlock - Unlock the list of active devices used by the PM core. -@@ -74,6 +75,7 @@ void device_pm_unlock(void) - { - mutex_unlock(&dpm_list_mtx); - } -+EXPORT_SYMBOL_GPL(device_pm_unlock); - - /** - * device_pm_add - Add a device to the PM core's list of active devices. 
-diff --git a/drivers/char/vt.c b/drivers/char/vt.c -index 50faa1f..567839a 100644 ---- a/drivers/char/vt.c -+++ b/drivers/char/vt.c -@@ -2465,6 +2465,7 @@ int vt_kmsg_redirect(int new) - else - return kmsg_con; - } -+EXPORT_SYMBOL_GPL(vt_kmsg_redirect); - - /* - * Console on virtual terminal -diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c -index 8bf3770..f1d1e03 100644 ---- a/drivers/gpu/drm/drm_gem.c -+++ b/drivers/gpu/drm/drm_gem.c -@@ -138,7 +138,8 @@ drm_gem_object_alloc(struct drm_device *dev, size_t size) - goto free; - - obj->dev = dev; -- obj->filp = shmem_file_setup("drm mm object", size, VM_NORESERVE); -+ obj->filp = shmem_file_setup("drm mm object", size, -+ VM_NORESERVE | VM_ATOMIC_COPY); - if (IS_ERR(obj->filp)) - goto free; - -diff --git a/drivers/md/md.c b/drivers/md/md.c -index a20a71e..6f221e5 100644 ---- a/drivers/md/md.c -+++ b/drivers/md/md.c -@@ -6469,6 +6469,9 @@ void md_do_sync(mddev_t *mddev) - mddev->curr_resync = 2; - - try_again: -+ while (freezer_is_on()) -+ yield(); -+ - if (kthread_should_stop()) - set_bit(MD_RECOVERY_INTR, &mddev->recovery); - -@@ -6491,6 +6494,10 @@ void md_do_sync(mddev_t *mddev) - * time 'round when curr_resync == 2 - */ - continue; -+ -+ while (freezer_is_on()) -+ yield(); -+ - /* We need to wait 'interruptible' so as not to - * contribute to the load average, and not to - * be caught by 'softlockup' -@@ -6503,6 +6510,7 @@ void md_do_sync(mddev_t *mddev) - " share one or more physical units)\n", - desc, mdname(mddev), mdname(mddev2)); - mddev_put(mddev2); -+ try_to_freeze(); - if (signal_pending(current)) - flush_signals(current); - schedule(); -@@ -6612,6 +6620,9 @@ void md_do_sync(mddev_t *mddev) - || kthread_should_stop()); - } - -+ while (freezer_is_on()) -+ yield(); -+ - if (kthread_should_stop()) - goto interrupted; - -@@ -6656,6 +6667,9 @@ void md_do_sync(mddev_t *mddev) - last_mark = next; - } - -+ while (freezer_is_on()) -+ yield(); -+ - - if (kthread_should_stop()) - goto 
interrupted; -diff --git a/fs/block_dev.c b/fs/block_dev.c -index d11d028..b2388cc 100644 ---- a/fs/block_dev.c -+++ b/fs/block_dev.c -@@ -335,6 +335,93 @@ out_unlock: - } - EXPORT_SYMBOL(thaw_bdev); - -+#ifdef CONFIG_FS_FREEZER_DEBUG -+#define FS_PRINTK(fmt, args...) printk(fmt, ## args) -+#else -+#define FS_PRINTK(fmt, args...) -+#endif -+ -+/* #define DEBUG_FS_FREEZING */ -+ -+/** -+ * freeze_filesystems - lock all filesystems and force them into a consistent -+ * state -+ * @which: What combination of fuse & non-fuse to freeze. -+ */ -+void freeze_filesystems(int which) -+{ -+ struct super_block *sb; -+ -+ lockdep_off(); -+ -+ /* -+ * Freeze in reverse order so filesystems dependant upon others are -+ * frozen in the right order (eg. loopback on ext3). -+ */ -+ list_for_each_entry_reverse(sb, &super_blocks, s_list) { -+ FS_PRINTK(KERN_INFO "Considering %s.%s: (root %p, bdev %x)", -+ sb->s_type->name ? sb->s_type->name : "?", -+ sb->s_subtype ? sb->s_subtype : "", sb->s_root, -+ sb->s_bdev ? sb->s_bdev->bd_dev : 0); -+ -+ if (sb->s_type->fs_flags & FS_IS_FUSE && -+ sb->s_frozen == SB_UNFROZEN && -+ which & FS_FREEZER_FUSE) { -+ sb->s_frozen = SB_FREEZE_TRANS; -+ sb->s_flags |= MS_FROZEN; -+ FS_PRINTK("Fuse filesystem done.\n"); -+ continue; -+ } -+ -+ if (!sb->s_root || !sb->s_bdev || -+ (sb->s_frozen == SB_FREEZE_TRANS) || -+ (sb->s_flags & MS_RDONLY) || -+ (sb->s_flags & MS_FROZEN) || -+ !(which & FS_FREEZER_NORMAL)) { -+ FS_PRINTK(KERN_INFO "Nope.\n"); -+ continue; -+ } -+ -+ FS_PRINTK(KERN_INFO "Freezing %x... ", sb->s_bdev->bd_dev); -+ freeze_bdev(sb->s_bdev); -+ sb->s_flags |= MS_FROZEN; -+ FS_PRINTK(KERN_INFO "Done.\n"); -+ } -+ -+ lockdep_on(); -+} -+ -+/** -+ * thaw_filesystems - unlock all filesystems -+ * @which: What combination of fuse & non-fuse to thaw. 
-+ */ -+void thaw_filesystems(int which) -+{ -+ struct super_block *sb; -+ -+ lockdep_off(); -+ -+ list_for_each_entry(sb, &super_blocks, s_list) { -+ if (!(sb->s_flags & MS_FROZEN)) -+ continue; -+ -+ if (sb->s_type->fs_flags & FS_IS_FUSE) { -+ if (!(which & FS_FREEZER_FUSE)) -+ continue; -+ -+ sb->s_frozen = SB_UNFROZEN; -+ } else { -+ if (!(which & FS_FREEZER_NORMAL)) -+ continue; -+ -+ thaw_bdev(sb->s_bdev, sb); -+ } -+ sb->s_flags &= ~MS_FROZEN; -+ } -+ -+ lockdep_on(); -+} -+ - static int blkdev_writepage(struct page *page, struct writeback_control *wbc) - { - return block_write_full_page(page, blkdev_get_block, wbc); -diff --git a/fs/drop_caches.c b/fs/drop_caches.c -index 31f4b0e..ff7df7a 100644 ---- a/fs/drop_caches.c -+++ b/fs/drop_caches.c -@@ -8,6 +8,7 @@ - #include - #include - #include -+#include - - /* A global variable is a bit ugly, but it keeps the code simple */ - int sysctl_drop_caches; -@@ -33,7 +34,7 @@ static void drop_pagecache_sb(struct super_block *sb) - iput(toput_inode); - } - --static void drop_pagecache(void) -+void drop_pagecache(void) - { - struct super_block *sb; - -@@ -61,6 +62,7 @@ static void drop_slab(void) - nr_objects = shrink_slab(1000, GFP_KERNEL, 1000); - } while (nr_objects > 10); - } -+EXPORT_SYMBOL_GPL(drop_pagecache); - - int drop_caches_sysctl_handler(ctl_table *table, int write, - void __user *buffer, size_t *length, loff_t *ppos) -diff --git a/fs/fuse/control.c b/fs/fuse/control.c -index 3773fd6..6272b60 100644 ---- a/fs/fuse/control.c -+++ b/fs/fuse/control.c -@@ -341,6 +341,7 @@ static void fuse_ctl_kill_sb(struct super_block *sb) - static struct file_system_type fuse_ctl_fs_type = { - .owner = THIS_MODULE, - .name = "fusectl", -+ .fs_flags = FS_IS_FUSE, - .get_sb = fuse_ctl_get_sb, - .kill_sb = fuse_ctl_kill_sb, - }; -diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c -index 51d9e33..12ad477 100644 ---- a/fs/fuse/dev.c -+++ b/fs/fuse/dev.c -@@ -7,6 +7,7 @@ - */ - - #include "fuse_i.h" -+#include "fuse.h" - - #include - 
#include -@@ -16,6 +17,7 @@ - #include - #include - #include -+#include - - MODULE_ALIAS_MISCDEV(FUSE_MINOR); - -@@ -758,6 +760,8 @@ static ssize_t fuse_dev_read(struct kiocb *iocb, const struct iovec *iov, - if (!fc) - return -EPERM; - -+ FUSE_MIGHT_FREEZE(file->f_mapping->host->i_sb, "fuse_dev_read"); -+ - restart: - spin_lock(&fc->lock); - err = -EAGAIN; -@@ -999,6 +1003,9 @@ static ssize_t fuse_dev_write(struct kiocb *iocb, const struct iovec *iov, - if (!fc) - return -EPERM; - -+ FUSE_MIGHT_FREEZE(iocb->ki_filp->f_mapping->host->i_sb, -+ "fuse_dev_write"); -+ - fuse_copy_init(&cs, fc, 0, NULL, iov, nr_segs); - if (nbytes < sizeof(struct fuse_out_header)) - return -EINVAL; -diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c -index 4787ae6..797b7dd 100644 ---- a/fs/fuse/dir.c -+++ b/fs/fuse/dir.c -@@ -7,12 +7,14 @@ - */ - - #include "fuse_i.h" -+#include "fuse.h" - - #include - #include - #include - #include - #include -+#include - - #if BITS_PER_LONG >= 64 - static inline void fuse_dentry_settime(struct dentry *entry, u64 time) -@@ -174,6 +176,9 @@ static int fuse_dentry_revalidate(struct dentry *entry, struct nameidata *nd) - return 0; - - fc = get_fuse_conn(inode); -+ -+ FUSE_MIGHT_FREEZE(inode->i_sb, "fuse_dentry_revalidate"); -+ - req = fuse_get_req(fc); - if (IS_ERR(req)) - return 0; -@@ -268,6 +273,8 @@ int fuse_lookup_name(struct super_block *sb, u64 nodeid, struct qstr *name, - if (name->len > FUSE_NAME_MAX) - goto out; - -+ FUSE_MIGHT_FREEZE(sb, "fuse_lookup_name"); -+ - req = fuse_get_req(fc); - err = PTR_ERR(req); - if (IS_ERR(req)) -@@ -331,6 +338,8 @@ static struct dentry *fuse_lookup(struct inode *dir, struct dentry *entry, - if (err) - goto out_err; - -+ FUSE_MIGHT_FREEZE(dir->i_sb, "fuse_lookup"); -+ - err = -EIO; - if (inode && get_node_id(inode) == FUSE_ROOT_ID) - goto out_iput; -@@ -392,6 +401,8 @@ static int fuse_create_open(struct inode *dir, struct dentry *entry, int mode, - if (IS_ERR(forget_req)) - return PTR_ERR(forget_req); - -+ 
FUSE_MIGHT_FREEZE(dir->i_sb, "fuse_create_open"); -+ - req = fuse_get_req(fc); - err = PTR_ERR(req); - if (IS_ERR(req)) -@@ -485,6 +496,8 @@ static int create_new_entry(struct fuse_conn *fc, struct fuse_req *req, - int err; - struct fuse_req *forget_req; - -+ FUSE_MIGHT_FREEZE(dir->i_sb, "create_new_entry"); -+ - forget_req = fuse_get_req(fc); - if (IS_ERR(forget_req)) { - fuse_put_request(fc, req); -@@ -587,7 +600,11 @@ static int fuse_mkdir(struct inode *dir, struct dentry *entry, int mode) - { - struct fuse_mkdir_in inarg; - struct fuse_conn *fc = get_fuse_conn(dir); -- struct fuse_req *req = fuse_get_req(fc); -+ struct fuse_req *req; -+ -+ FUSE_MIGHT_FREEZE(dir->i_sb, "fuse_mkdir"); -+ -+ req = fuse_get_req(fc); - if (IS_ERR(req)) - return PTR_ERR(req); - -@@ -611,7 +628,11 @@ static int fuse_symlink(struct inode *dir, struct dentry *entry, - { - struct fuse_conn *fc = get_fuse_conn(dir); - unsigned len = strlen(link) + 1; -- struct fuse_req *req = fuse_get_req(fc); -+ struct fuse_req *req; -+ -+ FUSE_MIGHT_FREEZE(dir->i_sb, "fuse_symlink"); -+ -+ req = fuse_get_req(fc); - if (IS_ERR(req)) - return PTR_ERR(req); - -@@ -628,7 +649,11 @@ static int fuse_unlink(struct inode *dir, struct dentry *entry) - { - int err; - struct fuse_conn *fc = get_fuse_conn(dir); -- struct fuse_req *req = fuse_get_req(fc); -+ struct fuse_req *req; -+ -+ FUSE_MIGHT_FREEZE(dir->i_sb, "fuse_unlink"); -+ -+ req = fuse_get_req(fc); - if (IS_ERR(req)) - return PTR_ERR(req); - -@@ -661,7 +686,11 @@ static int fuse_rmdir(struct inode *dir, struct dentry *entry) - { - int err; - struct fuse_conn *fc = get_fuse_conn(dir); -- struct fuse_req *req = fuse_get_req(fc); -+ struct fuse_req *req; -+ -+ FUSE_MIGHT_FREEZE(dir->i_sb, "fuse_rmdir"); -+ -+ req = fuse_get_req(fc); - if (IS_ERR(req)) - return PTR_ERR(req); - -diff --git a/fs/fuse/file.c b/fs/fuse/file.c -index a9f5e13..4647e11 100644 ---- a/fs/fuse/file.c -+++ b/fs/fuse/file.c -@@ -7,11 +7,13 @@ - */ - - #include "fuse_i.h" -+#include 
"fuse.h" - - #include - #include - #include - #include -+#include - #include - - static const struct file_operations fuse_direct_io_file_operations; -@@ -109,6 +111,8 @@ int fuse_do_open(struct fuse_conn *fc, u64 nodeid, struct file *file, - int err; - int opcode = isdir ? FUSE_OPENDIR : FUSE_OPEN; - -+ FUSE_MIGHT_FREEZE(file->f_path.dentry->d_inode->i_sb, "fuse_send_open"); -+ - ff = fuse_file_alloc(fc); - if (!ff) - return -ENOMEM; -@@ -316,6 +320,8 @@ static int fuse_flush(struct file *file, fl_owner_t id) - if (fc->no_flush) - return 0; - -+ FUSE_MIGHT_FREEZE(inode->i_sb, "fuse_flush"); -+ - req = fuse_get_req_nofail(fc, file); - memset(&inarg, 0, sizeof(inarg)); - inarg.fh = ff->fh; -@@ -367,6 +373,8 @@ int fuse_fsync_common(struct file *file, struct dentry *de, int datasync, - if ((!isdir && fc->no_fsync) || (isdir && fc->no_fsyncdir)) - return 0; - -+ FUSE_MIGHT_FREEZE(inode->i_sb, "fuse_fsync_common"); -+ - /* - * Start writeback against all dirty pages of the inode, then - * wait for all outstanding writes, before sending the FSYNC -@@ -474,6 +482,8 @@ static int fuse_readpage(struct file *file, struct page *page) - if (is_bad_inode(inode)) - goto out; - -+ FUSE_MIGHT_FREEZE(file->f_mapping->host->i_sb, "fuse_readpage"); -+ - /* - * Page writeback can extend beyond the liftime of the - * page-cache page, so make sure we read a properly synced -@@ -576,6 +586,9 @@ static int fuse_readpages_fill(void *_data, struct page *page) - struct inode *inode = data->inode; - struct fuse_conn *fc = get_fuse_conn(inode); - -+ FUSE_MIGHT_FREEZE(data->file->f_mapping->host->i_sb, -+ "fuse_readpages_fill"); -+ - fuse_wait_on_page_writeback(inode, page->index); - - if (req->num_pages && -@@ -606,6 +619,8 @@ static int fuse_readpages(struct file *file, struct address_space *mapping, - if (is_bad_inode(inode)) - goto out; - -+ FUSE_MIGHT_FREEZE(inode->i_sb, "fuse_readpages"); -+ - data.file = file; - data.inode = inode; - data.req = fuse_get_req(fc); -@@ -719,6 +734,8 @@ 
static int fuse_buffered_write(struct file *file, struct inode *inode, - if (is_bad_inode(inode)) - return -EIO; - -+ FUSE_MIGHT_FREEZE(inode->i_sb, "fuse_buffered_write"); -+ - /* - * Make sure writepages on the same page are not mixed up with - * plain writes. -@@ -878,6 +895,8 @@ static ssize_t fuse_perform_write(struct file *file, - struct fuse_req *req; - ssize_t count; - -+ FUSE_MIGHT_FREEZE(inode->i_sb, "fuse_perform_write"); -+ - req = fuse_get_req(fc); - if (IS_ERR(req)) { - err = PTR_ERR(req); -@@ -1025,6 +1044,8 @@ ssize_t fuse_direct_io(struct file *file, const char __user *buf, - ssize_t res = 0; - struct fuse_req *req; - -+ FUSE_MIGHT_FREEZE(file->f_mapping->host->i_sb, "fuse_direct_io"); -+ - req = fuse_get_req(fc); - if (IS_ERR(req)) - return PTR_ERR(req); -@@ -1412,6 +1433,8 @@ static int fuse_getlk(struct file *file, struct file_lock *fl) - struct fuse_lk_out outarg; - int err; - -+ FUSE_MIGHT_FREEZE(file->f_mapping->host->i_sb, "fuse_getlk"); -+ - req = fuse_get_req(fc); - if (IS_ERR(req)) - return PTR_ERR(req); -@@ -1447,6 +1470,8 @@ static int fuse_setlk(struct file *file, struct file_lock *fl, int flock) - if (fl->fl_flags & FL_CLOSE) - return 0; - -+ FUSE_MIGHT_FREEZE(file->f_mapping->host->i_sb, "fuse_setlk"); -+ - req = fuse_get_req(fc); - if (IS_ERR(req)) - return PTR_ERR(req); -@@ -1513,6 +1538,8 @@ static sector_t fuse_bmap(struct address_space *mapping, sector_t block) - if (!inode->i_sb->s_bdev || fc->no_bmap) - return 0; - -+ FUSE_MIGHT_FREEZE(inode->i_sb, "fuse_bmap"); -+ - req = fuse_get_req(fc); - if (IS_ERR(req)) - return 0; -diff --git a/fs/fuse/fuse.h b/fs/fuse/fuse.h -new file mode 100644 -index 0000000..170e49a ---- /dev/null -+++ b/fs/fuse/fuse.h -@@ -0,0 +1,13 @@ -+#define FUSE_MIGHT_FREEZE(superblock, desc) \ -+do { \ -+ int printed = 0; \ -+ while (superblock->s_frozen != SB_UNFROZEN) { \ -+ if (!printed) { \ -+ printk(KERN_INFO "%d frozen in " desc ".\n", \ -+ current->pid); \ -+ printed = 1; \ -+ } \ -+ try_to_freeze(); 
\ -+ yield(); \ -+ } \ -+} while (0) -diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c -index 1a822ce..9b69d61 100644 ---- a/fs/fuse/inode.c -+++ b/fs/fuse/inode.c -@@ -1062,7 +1062,7 @@ static void fuse_kill_sb_anon(struct super_block *sb) - static struct file_system_type fuse_fs_type = { - .owner = THIS_MODULE, - .name = "fuse", -- .fs_flags = FS_HAS_SUBTYPE, -+ .fs_flags = FS_HAS_SUBTYPE | FS_IS_FUSE, - .get_sb = fuse_get_sb, - .kill_sb = fuse_kill_sb_anon, - }; -@@ -1094,7 +1094,7 @@ static struct file_system_type fuseblk_fs_type = { - .name = "fuseblk", - .get_sb = fuse_get_sb_blk, - .kill_sb = fuse_kill_sb_blk, -- .fs_flags = FS_REQUIRES_DEV | FS_HAS_SUBTYPE, -+ .fs_flags = FS_REQUIRES_DEV | FS_HAS_SUBTYPE | FS_IS_FUSE, - }; - - static inline int register_fuseblk(void) -diff --git a/fs/namei.c b/fs/namei.c -index a4855af..3d57581 100644 ---- a/fs/namei.c -+++ b/fs/namei.c -@@ -2268,6 +2268,8 @@ int vfs_unlink(struct inode *dir, struct dentry *dentry) - if (!dir->i_op->unlink) - return -EPERM; - -+ vfs_check_frozen(dir->i_sb, SB_FREEZE_WRITE); -+ - vfs_dq_init(dir); - - mutex_lock(&dentry->d_inode->i_mutex); -diff --git a/fs/super.c b/fs/super.c -index aff046b..affb662 100644 ---- a/fs/super.c -+++ b/fs/super.c -@@ -42,6 +42,8 @@ - - - LIST_HEAD(super_blocks); -+EXPORT_SYMBOL_GPL(super_blocks); -+ - DEFINE_SPINLOCK(sb_lock); - - /** -diff --git a/include/linux/Kbuild b/include/linux/Kbuild -index 756f831..9953b39 100644 ---- a/include/linux/Kbuild -+++ b/include/linux/Kbuild -@@ -213,6 +213,7 @@ unifdef-y += filter.h - unifdef-y += flat.h - unifdef-y += futex.h - unifdef-y += fs.h -+unifdef-y += freezer.h - unifdef-y += gameport.h - unifdef-y += generic_serial.h - unifdef-y += hdlcdrv.h -diff --git a/include/linux/bio.h b/include/linux/bio.h -index 7fc5606..07e9b97 100644 ---- a/include/linux/bio.h -+++ b/include/linux/bio.h -@@ -175,8 +175,11 @@ enum bio_rw_flags { - BIO_RW_META, - BIO_RW_DISCARD, - BIO_RW_NOIDLE, -+ BIO_RW_TUXONICE, - }; - -+extern int 
trap_non_toi_io; -+ - /* - * First four bits must match between bio->bi_rw and rq->cmd_flags, make - * that explicit here. -diff --git a/include/linux/freezer.h b/include/linux/freezer.h -index 5a361f8..a66f2a9 100644 ---- a/include/linux/freezer.h -+++ b/include/linux/freezer.h -@@ -121,6 +121,19 @@ static inline void set_freezable(void) - current->flags &= ~PF_NOFREEZE; - } - -+extern int freezer_state; -+#define FREEZER_OFF 0 -+#define FREEZER_FILESYSTEMS_FROZEN 1 -+#define FREEZER_USERSPACE_FROZEN 2 -+#define FREEZER_FULLY_ON 3 -+ -+static inline int freezer_is_on(void) -+{ -+ return freezer_state == FREEZER_FULLY_ON; -+} -+ -+extern void thaw_kernel_threads(void); -+ - /* - * Tell the freezer that the current task should be frozen by it and that it - * should send a fake signal to the task to freeze it. -@@ -172,6 +185,8 @@ static inline int freeze_processes(void) { BUG(); return 0; } - static inline void thaw_processes(void) {} - - static inline int try_to_freeze(void) { return 0; } -+static inline int freezer_is_on(void) { return 0; } -+static inline void thaw_kernel_threads(void) { } - - static inline void freezer_do_not_count(void) {} - static inline void freezer_count(void) {} -diff --git a/include/linux/fs.h b/include/linux/fs.h -index ebb1cd5..e30e318 100644 ---- a/include/linux/fs.h -+++ b/include/linux/fs.h -@@ -173,6 +173,7 @@ struct inodes_stat_t { - #define FS_REQUIRES_DEV 1 - #define FS_BINARY_MOUNTDATA 2 - #define FS_HAS_SUBTYPE 4 -+#define FS_IS_FUSE 8 /* Fuse filesystem - bdev freeze these too */ - #define FS_REVAL_DOT 16384 /* Check the paths ".", ".." for staleness */ - #define FS_RENAME_DOES_D_MOVE 32768 /* FS will handle d_move() - * during rename() internally. 
-@@ -206,6 +207,7 @@ struct inodes_stat_t { - #define MS_KERNMOUNT (1<<22) /* this is a kern_mount call */ - #define MS_I_VERSION (1<<23) /* Update inode I_version field */ - #define MS_STRICTATIME (1<<24) /* Always perform atime updates */ -+#define MS_FROZEN (1<<25) /* Frozen by freeze_filesystems() */ - #define MS_ACTIVE (1<<30) - #define MS_NOUSER (1<<31) - -@@ -232,6 +234,8 @@ struct inodes_stat_t { - #define S_NOCMTIME 128 /* Do not update file c/mtime */ - #define S_SWAPFILE 256 /* Do not truncate: swapon got its bmaps */ - #define S_PRIVATE 512 /* Inode is fs-internal */ -+#define S_ATOMIC_COPY 1024 /* Pages mapped with this inode need to be -+ atomically copied (gem) */ - - /* - * Note that nosuid etc flags are inode-specific: setting some file-system -@@ -379,6 +383,7 @@ struct inodes_stat_t { - #include - #include - #include -+#include - - #include - #include -@@ -1391,8 +1396,11 @@ enum { - SB_FREEZE_TRANS = 2, - }; - --#define vfs_check_frozen(sb, level) \ -- wait_event((sb)->s_wait_unfrozen, ((sb)->s_frozen < (level))) -+#define vfs_check_frozen(sb, level) do { \ -+ freezer_do_not_count(); \ -+ wait_event((sb)->s_wait_unfrozen, ((sb)->s_frozen < (level))); \ -+ freezer_count(); \ -+} while (0) - - #define get_fs_excl() atomic_inc(¤t->fs_excl) - #define put_fs_excl() atomic_dec(¤t->fs_excl) -@@ -1947,6 +1955,13 @@ extern struct super_block *freeze_bdev(struct block_device *); - extern void emergency_thaw_all(void); - extern int thaw_bdev(struct block_device *bdev, struct super_block *sb); - extern int fsync_bdev(struct block_device *); -+extern int fsync_super(struct super_block *); -+extern int fsync_no_super(struct block_device *); -+#define FS_FREEZER_FUSE 1 -+#define FS_FREEZER_NORMAL 2 -+#define FS_FREEZER_ALL (FS_FREEZER_FUSE | FS_FREEZER_NORMAL) -+void freeze_filesystems(int which); -+void thaw_filesystems(int which); - #else - static inline void bd_forget(struct inode *inode) {} - static inline int sync_blockdev(struct block_device *bdev) { 
return 0; } -diff --git a/include/linux/mm.h b/include/linux/mm.h -index 60c467b..1e722c2 100644 ---- a/include/linux/mm.h -+++ b/include/linux/mm.h -@@ -97,6 +97,7 @@ extern unsigned int kobjsize(const void *objp); - #define VM_HUGETLB 0x00400000 /* Huge TLB Page VM */ - #define VM_NONLINEAR 0x00800000 /* Is non-linear (remap_file_pages) */ - #define VM_MAPPED_COPY 0x01000000 /* T if mapped copy of data (nommu mmap) */ -+#define VM_ATOMIC_COPY 0x01000000 /* TOI should do atomic copy (mmu) */ - #define VM_INSERTPAGE 0x02000000 /* The vma has had "vm_insert_page()" done on it */ - #define VM_ALWAYSDUMP 0x04000000 /* Always include in core dumps */ - -@@ -1309,6 +1310,7 @@ int drop_caches_sysctl_handler(struct ctl_table *, int, - void __user *, size_t *, loff_t *); - unsigned long shrink_slab(unsigned long scanned, gfp_t gfp_mask, - unsigned long lru_pages); -+void drop_pagecache(void); - - #ifndef CONFIG_MMU - #define randomize_va_space 0 -diff --git a/include/linux/netlink.h b/include/linux/netlink.h -index fde27c0..0d86cc7 100644 ---- a/include/linux/netlink.h -+++ b/include/linux/netlink.h -@@ -24,6 +24,8 @@ - /* leave room for NETLINK_DM (DM Events) */ - #define NETLINK_SCSITRANSPORT 18 /* SCSI Transports */ - #define NETLINK_ECRYPTFS 19 -+#define NETLINK_TOI_USERUI 20 /* TuxOnIce's userui */ -+#define NETLINK_TOI_USM 21 /* Userspace storage manager */ - - #define MAX_LINKS 32 - -diff --git a/include/linux/suspend.h b/include/linux/suspend.h -index 5e781d8..a1c07f3 100644 ---- a/include/linux/suspend.h -+++ b/include/linux/suspend.h -@@ -329,4 +329,70 @@ static inline void unlock_system_sleep(void) - } - #endif - -+enum { -+ TOI_CAN_HIBERNATE, -+ TOI_CAN_RESUME, -+ TOI_RESUME_DEVICE_OK, -+ TOI_NORESUME_SPECIFIED, -+ TOI_SANITY_CHECK_PROMPT, -+ TOI_CONTINUE_REQ, -+ TOI_RESUMED_BEFORE, -+ TOI_BOOT_TIME, -+ TOI_NOW_RESUMING, -+ TOI_IGNORE_LOGLEVEL, -+ TOI_TRYING_TO_RESUME, -+ TOI_LOADING_ALT_IMAGE, -+ TOI_STOP_RESUME, -+ TOI_IO_STOPPED, -+ TOI_NOTIFIERS_PREPARE, -+ 
TOI_CLUSTER_MODE, -+ TOI_BOOT_KERNEL, -+}; -+ -+#ifdef CONFIG_TOI -+ -+/* Used in init dir files */ -+extern unsigned long toi_state; -+#define set_toi_state(bit) (set_bit(bit, &toi_state)) -+#define clear_toi_state(bit) (clear_bit(bit, &toi_state)) -+#define test_toi_state(bit) (test_bit(bit, &toi_state)) -+extern int toi_running; -+ -+#define test_action_state(bit) (test_bit(bit, &toi_bkd.toi_action)) -+extern int try_tuxonice_hibernate(void); -+ -+#else /* !CONFIG_TOI */ -+ -+#define toi_state (0) -+#define set_toi_state(bit) do { } while (0) -+#define clear_toi_state(bit) do { } while (0) -+#define test_toi_state(bit) (0) -+#define toi_running (0) -+ -+static inline int try_tuxonice_hibernate(void) { return 0; } -+#define test_action_state(bit) (0) -+ -+#endif /* CONFIG_TOI */ -+ -+#ifdef CONFIG_HIBERNATION -+#ifdef CONFIG_TOI -+extern void try_tuxonice_resume(void); -+#else -+#define try_tuxonice_resume() do { } while (0) -+#endif -+ -+extern int resume_attempted; -+extern int software_resume(void); -+ -+static inline void check_resume_attempted(void) -+{ -+ if (resume_attempted) -+ return; -+ -+ software_resume(); -+} -+#else -+#define check_resume_attempted() do { } while (0) -+#define resume_attempted (0) -+#endif - #endif /* _LINUX_SUSPEND_H */ -diff --git a/include/linux/swap.h b/include/linux/swap.h -index a2602a8..06c4630 100644 ---- a/include/linux/swap.h -+++ b/include/linux/swap.h -@@ -196,6 +196,7 @@ struct swap_list_t { - extern unsigned long totalram_pages; - extern unsigned long totalreserve_pages; - extern unsigned int nr_free_buffer_pages(void); -+extern unsigned int nr_unallocated_buffer_pages(void); - extern unsigned int nr_free_pagecache_pages(void); - - /* Definition of global_page_state not available yet */ -@@ -325,8 +326,10 @@ extern void swapcache_free(swp_entry_t, struct page *page); - extern int free_swap_and_cache(swp_entry_t); - extern int swap_type_of(dev_t, sector_t, struct block_device **); - extern unsigned int 
count_swap_pages(int, int); -+extern sector_t map_swap_entry(swp_entry_t entry, struct block_device **); - extern sector_t map_swap_page(struct page *, struct block_device **); - extern sector_t swapdev_block(int, pgoff_t); -+extern struct swap_info_struct *get_swap_info_struct(unsigned); - extern int reuse_swap_page(struct page *); - extern int try_to_free_swap(struct page *); - struct backing_dev_info; -diff --git a/include/linux/uuid.h b/include/linux/uuid.h -new file mode 100644 -index 0000000..a968f0f ---- /dev/null -+++ b/include/linux/uuid.h -@@ -0,0 +1,18 @@ -+#include -+ -+struct hd_struct; -+struct block_device; -+ -+struct fs_info { -+ char uuid[16]; -+ char *last_mount; -+ int last_mount_size; -+}; -+ -+int part_matches_uuid(struct hd_struct *part, const char *uuid); -+dev_t blk_lookup_uuid(const char *uuid); -+struct fs_info *fs_info_from_block_dev(struct block_device *bdev); -+void free_fs_info(struct fs_info *fs_info); -+int bdev_matches_key(struct block_device *bdev, const char *key); -+struct block_device *next_bdev_of_type(struct block_device *last, -+ const char *key); -diff --git a/init/do_mounts.c b/init/do_mounts.c -index bb008d0..5273dc9 100644 ---- a/init/do_mounts.c -+++ b/init/do_mounts.c -@@ -143,6 +143,7 @@ fail: - done: - return res; - } -+EXPORT_SYMBOL_GPL(name_to_dev_t); - - static int __init root_dev_setup(char *line) - { -@@ -413,6 +414,8 @@ void __init prepare_namespace(void) - if (is_floppy && rd_doload && rd_load_disk(0)) - ROOT_DEV = Root_RAM0; - -+ check_resume_attempted(); -+ - mount_root(); - out: - devtmpfs_mount("dev"); -diff --git a/init/do_mounts_initrd.c b/init/do_mounts_initrd.c -index 614241b..f3ea292 100644 ---- a/init/do_mounts_initrd.c -+++ b/init/do_mounts_initrd.c -@@ -6,6 +6,7 @@ - #include - #include - #include -+#include - #include - - #include "do_mounts.h" -@@ -68,6 +69,11 @@ static void __init handle_initrd(void) - - current->flags &= ~PF_FREEZER_SKIP; - -+ if (!resume_attempted) -+ printk(KERN_ERR 
"TuxOnIce: No attempt was made to resume from " -+ "any image that might exist.\n"); -+ clear_toi_state(TOI_BOOT_TIME); -+ - /* move initrd to rootfs' /old */ - sys_fchdir(old_fd); - sys_mount("/", ".", NULL, MS_MOVE, NULL); -diff --git a/init/main.c b/init/main.c -index 4cb47a1..36eac80 100644 ---- a/init/main.c -+++ b/init/main.c -@@ -116,6 +116,7 @@ extern void softirq_init(void); - char __initdata boot_command_line[COMMAND_LINE_SIZE]; - /* Untouched saved command line (eg. for /proc) */ - char *saved_command_line; -+EXPORT_SYMBOL_GPL(saved_command_line); - /* Command line for parameter parsing */ - static char *static_command_line; - -diff --git a/kernel/cpu.c b/kernel/cpu.c -index 677f253..aad27c8 100644 ---- a/kernel/cpu.c -+++ b/kernel/cpu.c -@@ -402,6 +402,7 @@ int disable_nonboot_cpus(void) - stop_machine_destroy(); - return error; - } -+EXPORT_SYMBOL_GPL(disable_nonboot_cpus); - - void __weak arch_enable_nonboot_cpus_begin(void) - { -@@ -440,6 +441,7 @@ void __ref enable_nonboot_cpus(void) - out: - cpu_maps_update_done(); - } -+EXPORT_SYMBOL_GPL(enable_nonboot_cpus); - - static int alloc_frozen_cpus(void) - { -diff --git a/kernel/fork.c b/kernel/fork.c -index f88bd98..17bbf09 100644 ---- a/kernel/fork.c -+++ b/kernel/fork.c -@@ -86,6 +86,7 @@ int max_threads; /* tunable limit on nr_threads */ - DEFINE_PER_CPU(unsigned long, process_counts) = 0; - - __cacheline_aligned DEFINE_RWLOCK(tasklist_lock); /* outer */ -+EXPORT_SYMBOL_GPL(tasklist_lock); - - int nr_processes(void) - { -diff --git a/kernel/kmod.c b/kernel/kmod.c -index bf0e231..de63918 100644 ---- a/kernel/kmod.c -+++ b/kernel/kmod.c -@@ -326,6 +326,7 @@ int usermodehelper_disable(void) - usermodehelper_disabled = 0; - return -EAGAIN; - } -+EXPORT_SYMBOL_GPL(usermodehelper_disable); - - /** - * usermodehelper_enable - allow new helpers to be started again -@@ -334,6 +335,7 @@ void usermodehelper_enable(void) - { - usermodehelper_disabled = 0; - } -+EXPORT_SYMBOL_GPL(usermodehelper_enable); - - 
static void helper_lock(void) - { -diff --git a/kernel/pid.c b/kernel/pid.c -index 2e17c9c..f83eb67 100644 ---- a/kernel/pid.c -+++ b/kernel/pid.c -@@ -382,6 +382,7 @@ struct task_struct *find_task_by_pid_ns(pid_t nr, struct pid_namespace *ns) - { - return pid_task(find_pid_ns(nr, ns), PIDTYPE_PID); - } -+EXPORT_SYMBOL_GPL(find_task_by_pid_ns); - - struct task_struct *find_task_by_vpid(pid_t vnr) - { -diff --git a/kernel/power/Kconfig b/kernel/power/Kconfig -index 91e09d3..733ff64 100644 ---- a/kernel/power/Kconfig -+++ b/kernel/power/Kconfig -@@ -38,6 +38,13 @@ config CAN_PM_TRACE - def_bool y - depends on PM_DEBUG && PM_SLEEP && EXPERIMENTAL - -+config FS_FREEZER_DEBUG -+ bool "Filesystem freezer debugging" -+ depends on PM_DEBUG -+ default n -+ ---help--- -+ This option enables debugging of the filesystem freezing code. -+ - config PM_TRACE - bool - help -@@ -183,6 +190,238 @@ config PM_STD_PARTITION - suspended image to. It will simply pick the first available swap - device. - -+menuconfig TOI_CORE -+ tristate "Enhanced Hibernation (TuxOnIce)" -+ depends on HIBERNATION -+ default y -+ ---help--- -+ TuxOnIce is the 'new and improved' suspend support. -+ -+ See the TuxOnIce home page (tuxonice.net) -+ for FAQs, HOWTOs and other documentation. -+ -+ comment "Image Storage (you need at least one allocator)" -+ depends on TOI_CORE -+ -+ config TOI_FILE -+ tristate "File Allocator" -+ depends on TOI_CORE -+ default y -+ ---help--- -+ This option enables support for storing an image in a -+ simple file. You might want this if your swap is -+ sometimes full enough that you don't have enough spare -+ space to store an image. -+ -+ config TOI_SWAP -+ tristate "Swap Allocator" -+ depends on TOI_CORE && SWAP -+ default y -+ ---help--- -+ This option enables support for storing an image in your -+ swap space. 
-+ -+ comment "General Options" -+ depends on TOI_CORE -+ -+ config TOI_CRYPTO -+ tristate "Compression support" -+ depends on TOI_CORE && CRYPTO -+ default y -+ ---help--- -+ This option adds support for using cryptoapi compression -+ algorithms. Compression is particularly useful as it can -+ more than double your suspend and resume speed (depending -+ upon how well your image compresses). -+ -+ You probably want this, so say Y here. -+ -+ comment "No compression support available without Cryptoapi support." -+ depends on TOI_CORE && !CRYPTO -+ -+ config TOI_USERUI -+ tristate "Userspace User Interface support" -+ depends on TOI_CORE && NET && (VT || SERIAL_CONSOLE) -+ default y -+ ---help--- -+ This option enabled support for a userspace based user interface -+ to TuxOnIce, which allows you to have a nice display while suspending -+ and resuming, and also enables features such as pressing escape to -+ cancel a cycle or interactive debugging. -+ -+ config TOI_USERUI_DEFAULT_PATH -+ string "Default userui program location" -+ default "/usr/local/sbin/tuxoniceui_text" -+ depends on TOI_USERUI -+ ---help--- -+ This entry allows you to specify a default path to the userui binary. -+ -+ config TOI_KEEP_IMAGE -+ bool "Allow Keep Image Mode" -+ depends on TOI_CORE -+ ---help--- -+ This option allows you to keep and image and reuse it. It is intended -+ __ONLY__ for use with systems where all filesystems are mounted read- -+ only (kiosks, for example). To use it, compile this option in and boot -+ normally. Set the KEEP_IMAGE flag in /sys/power/tuxonice and suspend. -+ When you resume, the image will not be removed. You will be unable to turn -+ off swap partitions (assuming you are using the swap allocator), but future -+ suspends simply do a power-down. The image can be updated using the -+ kernel command line parameter suspend_act= to turn off the keep image -+ bit. Keep image mode is a little less user friendly on purpose - it -+ should not be used without thought! 
-+ -+ config TOI_REPLACE_SWSUSP -+ bool "Replace swsusp by default" -+ default y -+ depends on TOI_CORE -+ ---help--- -+ TuxOnIce can replace swsusp. This option makes that the default state, -+ requiring you to echo 0 > /sys/power/tuxonice/replace_swsusp if you want -+ to use the vanilla kernel functionality. Note that your initrd/ramfs will -+ need to do this before trying to resume, too. -+ With overriding swsusp enabled, echoing disk to /sys/power/state will -+ start a TuxOnIce cycle. If resume= doesn't specify an allocator and both -+ the swap and file allocators are compiled in, the swap allocator will be -+ used by default. -+ -+ config TOI_IGNORE_LATE_INITCALL -+ bool "Wait for initrd/ramfs to run, by default" -+ default n -+ depends on TOI_CORE -+ ---help--- -+ When booting, TuxOnIce can check for an image and start to resume prior -+ to any initrd/ramfs running (via a late initcall). -+ -+ If you don't have an initrd/ramfs, this is what you want to happen - -+ otherwise you won't be able to safely resume. You should set this option -+ to 'No'. -+ -+ If, however, you want your initrd/ramfs to run anyway before resuming, -+ you need to tell TuxOnIce to ignore that earlier opportunity to resume. -+ This can be done either by using this compile time option, or by -+ overriding this option with the boot-time parameter toi_initramfs_resume_only=1. -+ -+ Note that if TuxOnIce can't resume at the earlier opportunity, the -+ value of this option won't matter - the initramfs/initrd (if any) will -+ run anyway. -+ -+ menuconfig TOI_CLUSTER -+ tristate "Cluster support" -+ default n -+ depends on TOI_CORE && NET && BROKEN -+ ---help--- -+ Support for linking multiple machines in a cluster so that they suspend -+ and resume together. -+ -+ config TOI_DEFAULT_CLUSTER_INTERFACE -+ string "Default cluster interface" -+ depends on TOI_CLUSTER -+ ---help--- -+ The default interface on which to communicate with other nodes in -+ the cluster. 
-+ -+ If no value is set here, cluster support will be disabled by default. -+ -+ config TOI_DEFAULT_CLUSTER_KEY -+ string "Default cluster key" -+ default "Default" -+ depends on TOI_CLUSTER -+ ---help--- -+ The default key used by this node. All nodes in the same cluster -+ have the same key. Multiple clusters may coexist on the same lan -+ by using different values for this key. -+ -+ config TOI_CLUSTER_IMAGE_TIMEOUT -+ int "Timeout when checking for image" -+ default 15 -+ depends on TOI_CLUSTER -+ ---help--- -+ Timeout (seconds) before continuing to boot when waiting to see -+ whether other nodes might have an image. Set to -1 to wait -+ indefinitely. In WAIT_UNTIL_NODES is non zero, we might continue -+ booting sooner than this timeout. -+ -+ config TOI_CLUSTER_WAIT_UNTIL_NODES -+ int "Nodes without image before continuing" -+ default 0 -+ depends on TOI_CLUSTER -+ ---help--- -+ When booting and no image is found, we wait to see if other nodes -+ have an image before continuing to boot. This value lets us -+ continue after seeing a certain number of nodes without an image, -+ instead of continuing to wait for the timeout. Set to 0 to only -+ use the timeout. -+ -+ config TOI_DEFAULT_CLUSTER_PRE_HIBERNATE -+ string "Default pre-hibernate script" -+ depends on TOI_CLUSTER -+ ---help--- -+ The default script to be called when starting to hibernate. -+ -+ config TOI_DEFAULT_CLUSTER_POST_HIBERNATE -+ string "Default post-hibernate script" -+ depends on TOI_CLUSTER -+ ---help--- -+ The default script to be called after resuming from hibernation. -+ -+ config TOI_DEFAULT_WAIT -+ int "Default waiting time for emergency boot messages" -+ default "25" -+ range -1 32768 -+ depends on TOI_CORE -+ help -+ TuxOnIce can display warnings very early in the process of resuming, -+ if (for example) it appears that you have booted a kernel that doesn't -+ match an image on disk. 
It can then give you the opportunity to either -+ continue booting that kernel, or reboot the machine. This option can be -+ used to control how long to wait in such circumstances. -1 means wait -+ forever. 0 means don't wait at all (do the default action, which will -+ generally be to continue booting and remove the image). Values of 1 or -+ more indicate a number of seconds (up to 255) to wait before doing the -+ default. -+ -+ config TOI_DEFAULT_EXTRA_PAGES_ALLOWANCE -+ int "Default extra pages allowance" -+ default "2000" -+ range 500 32768 -+ depends on TOI_CORE -+ help -+ This value controls the default for the allowance TuxOnIce makes for -+ drivers to allocate extra memory during the atomic copy. The default -+ value of 2000 will be okay in most cases. If you are using -+ DRI, the easiest way to find what value to use is to try to hibernate -+ and look at how many pages were actually needed in the sysfs entry -+ /sys/power/tuxonice/debug_info (first number on the last line), adding -+ a little extra because the value is not always the same. -+ -+ config TOI_CHECKSUM -+ bool "Checksum pageset2" -+ default n -+ depends on TOI_CORE -+ select CRYPTO -+ select CRYPTO_ALGAPI -+ select CRYPTO_MD4 -+ ---help--- -+ Adds support for checksumming pageset2 pages, to ensure you really get an -+ atomic copy. Since some filesystems (XFS especially) change metadata even -+ when there's no other activity, we need this to check for pages that have -+ been changed while we were saving the page cache. If your debugging output -+ always says no pages were resaved, you may be able to safely disable this -+ option. 
-+ -+config TOI -+ bool -+ depends on TOI_CORE!=n -+ default y -+ -+config TOI_EXPORTS -+ bool -+ depends on TOI_SWAP=m || TOI_FILE=m || \ -+ TOI_CRYPTO=m || TOI_CLUSTER=m || \ -+ TOI_USERUI=m || TOI_CORE=m -+ default y -+ - config APM_EMULATION - tristate "Advanced Power Management Emulation" - depends on PM && SYS_SUPPORTS_APM_EMULATION -diff --git a/kernel/power/Makefile b/kernel/power/Makefile -index 4319181..18c4733 100644 ---- a/kernel/power/Makefile -+++ b/kernel/power/Makefile -@@ -3,6 +3,35 @@ ifeq ($(CONFIG_PM_DEBUG),y) - EXTRA_CFLAGS += -DDEBUG - endif - -+tuxonice_core-y := tuxonice_modules.o -+ -+obj-$(CONFIG_TOI) += tuxonice_builtin.o -+ -+tuxonice_core-$(CONFIG_PM_DEBUG) += tuxonice_alloc.o -+ -+# Compile these in after allocation debugging, if used. -+ -+tuxonice_core-y += tuxonice_sysfs.o tuxonice_highlevel.o \ -+ tuxonice_io.o tuxonice_pagedir.o tuxonice_prepare_image.o \ -+ tuxonice_extent.o tuxonice_pageflags.o tuxonice_ui.o \ -+ tuxonice_power_off.o tuxonice_atomic_copy.o -+ -+tuxonice_core-$(CONFIG_TOI_CHECKSUM) += tuxonice_checksum.o -+ -+tuxonice_core-$(CONFIG_NET) += tuxonice_storage.o tuxonice_netlink.o -+ -+obj-$(CONFIG_TOI_CORE) += tuxonice_core.o -+obj-$(CONFIG_TOI_CRYPTO) += tuxonice_compress.o -+ -+tuxonice_bio-y := tuxonice_bio_core.o tuxonice_bio_chains.o \ -+ tuxonice_bio_signature.o -+ -+obj-$(CONFIG_TOI_SWAP) += tuxonice_bio.o tuxonice_swap.o -+obj-$(CONFIG_TOI_FILE) += tuxonice_bio.o tuxonice_file.o -+obj-$(CONFIG_TOI_CLUSTER) += tuxonice_cluster.o -+ -+obj-$(CONFIG_TOI_USERUI) += tuxonice_userui.o -+ - obj-$(CONFIG_PM) += main.o - obj-$(CONFIG_PM_SLEEP) += console.o - obj-$(CONFIG_FREEZER) += process.o -diff --git a/kernel/power/console.c b/kernel/power/console.c -index 218e5af..95a6bdc 100644 ---- a/kernel/power/console.c -+++ b/kernel/power/console.c -@@ -24,6 +24,7 @@ int pm_prepare_console(void) - orig_kmsg = vt_kmsg_redirect(SUSPEND_CONSOLE); - return 0; - } -+EXPORT_SYMBOL_GPL(pm_prepare_console); - - void 
pm_restore_console(void) - { -@@ -32,4 +33,5 @@ void pm_restore_console(void) - vt_kmsg_redirect(orig_kmsg); - } - } -+EXPORT_SYMBOL_GPL(pm_restore_console); - #endif -diff --git a/kernel/power/hibernate.c b/kernel/power/hibernate.c -index bbfe472..07c550b 100644 ---- a/kernel/power/hibernate.c -+++ b/kernel/power/hibernate.c -@@ -25,11 +25,12 @@ - #include - #include - --#include "power.h" -+#include "tuxonice.h" - - - static int noresume = 0; --static char resume_file[256] = CONFIG_PM_STD_PARTITION; -+char resume_file[256] = CONFIG_PM_STD_PARTITION; -+EXPORT_SYMBOL_GPL(resume_file); - dev_t swsusp_resume_device; - sector_t swsusp_resume_block; - int in_suspend __nosavedata = 0; -@@ -116,55 +117,60 @@ static int hibernation_test(int level) { return 0; } - * hibernation - */ - --static int platform_begin(int platform_mode) -+int platform_begin(int platform_mode) - { - return (platform_mode && hibernation_ops) ? - hibernation_ops->begin() : 0; - } -+EXPORT_SYMBOL_GPL(platform_begin); - - /** - * platform_end - tell the platform driver that we've entered the - * working state - */ - --static void platform_end(int platform_mode) -+void platform_end(int platform_mode) - { - if (platform_mode && hibernation_ops) - hibernation_ops->end(); - } -+EXPORT_SYMBOL_GPL(platform_end); - - /** - * platform_pre_snapshot - prepare the machine for hibernation using the - * platform driver if so configured and return an error code if it fails - */ - --static int platform_pre_snapshot(int platform_mode) -+int platform_pre_snapshot(int platform_mode) - { - return (platform_mode && hibernation_ops) ? 
- hibernation_ops->pre_snapshot() : 0; - } -+EXPORT_SYMBOL_GPL(platform_pre_snapshot); - - /** - * platform_leave - prepare the machine for switching to the normal mode - * of operation using the platform driver (called with interrupts disabled) - */ - --static void platform_leave(int platform_mode) -+void platform_leave(int platform_mode) - { - if (platform_mode && hibernation_ops) - hibernation_ops->leave(); - } -+EXPORT_SYMBOL_GPL(platform_leave); - - /** - * platform_finish - switch the machine to the normal mode of operation - * using the platform driver (must be called after platform_prepare()) - */ - --static void platform_finish(int platform_mode) -+void platform_finish(int platform_mode) - { - if (platform_mode && hibernation_ops) - hibernation_ops->finish(); - } -+EXPORT_SYMBOL_GPL(platform_finish); - - /** - * platform_pre_restore - prepare the platform for the restoration from a -@@ -172,11 +178,12 @@ static void platform_finish(int platform_mode) - * called, platform_restore_cleanup() must be called. - */ - --static int platform_pre_restore(int platform_mode) -+int platform_pre_restore(int platform_mode) - { - return (platform_mode && hibernation_ops) ? - hibernation_ops->pre_restore() : 0; - } -+EXPORT_SYMBOL_GPL(platform_pre_restore); - - /** - * platform_restore_cleanup - switch the platform to the normal mode of -@@ -185,22 +192,24 @@ static int platform_pre_restore(int platform_mode) - * regardless of the result of platform_pre_restore(). - */ - --static void platform_restore_cleanup(int platform_mode) -+void platform_restore_cleanup(int platform_mode) - { - if (platform_mode && hibernation_ops) - hibernation_ops->restore_cleanup(); - } -+EXPORT_SYMBOL_GPL(platform_restore_cleanup); - - /** - * platform_recover - recover the platform from a failure to suspend - * devices. 
- */ - --static void platform_recover(int platform_mode) -+void platform_recover(int platform_mode) - { - if (platform_mode && hibernation_ops && hibernation_ops->recover) - hibernation_ops->recover(); - } -+EXPORT_SYMBOL_GPL(platform_recover); - - /** - * swsusp_show_speed - print the time elapsed between two events. -@@ -525,6 +534,7 @@ int hibernation_platform_enter(void) - - return error; - } -+EXPORT_SYMBOL_GPL(hibernation_platform_enter); - - /** - * power_down - Shut the machine down for hibernation. -@@ -576,6 +586,9 @@ int hibernate(void) - { - int error; - -+ if (test_action_state(TOI_REPLACE_SWSUSP)) -+ return try_tuxonice_hibernate(); -+ - mutex_lock(&pm_mutex); - /* The snapshot device should not be opened while we're running */ - if (!atomic_add_unless(&snapshot_device_available, -1, 0)) { -@@ -656,11 +669,19 @@ int hibernate(void) - * - */ - --static int software_resume(void) -+int software_resume(void) - { - int error; - unsigned int flags; - -+ resume_attempted = 1; -+ -+ /* -+ * We can't know (until an image header - if any - is loaded), whether -+ * we did override swsusp. We therefore ensure that both are tried. -+ */ -+ try_tuxonice_resume(); -+ - /* - * If the user said "noresume".. bail out early. 
- */ -@@ -989,6 +1010,7 @@ static int __init resume_offset_setup(char *str) - static int __init noresume_setup(char *str) - { - noresume = 1; -+ set_toi_state(TOI_NORESUME_SPECIFIED); - return 1; - } - -diff --git a/kernel/power/main.c b/kernel/power/main.c -index 0998c71..9509733 100644 ---- a/kernel/power/main.c -+++ b/kernel/power/main.c -@@ -16,6 +16,7 @@ - #include "power.h" - - DEFINE_MUTEX(pm_mutex); -+EXPORT_SYMBOL_GPL(pm_mutex); - - unsigned int pm_flags; - EXPORT_SYMBOL(pm_flags); -@@ -24,7 +25,8 @@ EXPORT_SYMBOL(pm_flags); - - /* Routines for PM-transition notifications */ - --static BLOCKING_NOTIFIER_HEAD(pm_chain_head); -+BLOCKING_NOTIFIER_HEAD(pm_chain_head); -+EXPORT_SYMBOL_GPL(pm_chain_head); - - int register_pm_notifier(struct notifier_block *nb) - { -@@ -43,6 +45,7 @@ int pm_notifier_call_chain(unsigned long val) - return (blocking_notifier_call_chain(&pm_chain_head, val, NULL) - == NOTIFY_BAD) ? -EINVAL : 0; - } -+EXPORT_SYMBOL_GPL(pm_notifier_call_chain); - - #ifdef CONFIG_PM_DEBUG - int pm_test_level = TEST_NONE; -@@ -110,6 +113,7 @@ power_attr(pm_test); - #endif /* CONFIG_PM_SLEEP */ - - struct kobject *power_kobj; -+EXPORT_SYMBOL_GPL(power_kobj); - - /** - * state - control system power state. -diff --git a/kernel/power/power.h b/kernel/power/power.h -index 46c5a26..d8c8f32 100644 ---- a/kernel/power/power.h -+++ b/kernel/power/power.h -@@ -31,8 +31,12 @@ static inline char *check_image_kernel(struct swsusp_info *info) - return arch_hibernation_header_restore(info) ? - "architecture specific data" : NULL; - } -+#else -+extern char *check_image_kernel(struct swsusp_info *info); - #endif /* CONFIG_ARCH_HIBERNATION_HEADER */ -+extern int init_header(struct swsusp_info *info); - -+extern char resume_file[256]; - /* - * Keep some memory free so that I/O operations can succeed without paging - * [Might this be more than 4 MB?] 
-@@ -49,6 +53,7 @@ static inline char *check_image_kernel(struct swsusp_info *info) - extern int hibernation_snapshot(int platform_mode); - extern int hibernation_restore(int platform_mode); - extern int hibernation_platform_enter(void); -+extern void platform_recover(int platform_mode); - #endif - - extern int pfn_is_nosave(unsigned long); -@@ -63,6 +68,8 @@ static struct kobj_attribute _name##_attr = { \ - .store = _name##_store, \ - } - -+extern struct pbe *restore_pblist; -+ - /* Preferred image size in bytes (default 500 MB) */ - extern unsigned long image_size; - extern int in_suspend; -@@ -236,3 +243,86 @@ static inline void suspend_thaw_processes(void) - { - } - #endif -+ -+extern struct page *saveable_page(struct zone *z, unsigned long p); -+#ifdef CONFIG_HIGHMEM -+extern struct page *saveable_highmem_page(struct zone *z, unsigned long p); -+#else -+static -+inline struct page *saveable_highmem_page(struct zone *z, unsigned long p) -+{ -+ return NULL; -+} -+#endif -+ -+#define PBES_PER_PAGE (PAGE_SIZE / sizeof(struct pbe)) -+extern struct list_head nosave_regions; -+ -+/** -+ * This structure represents a range of page frames the contents of which -+ * should not be saved during the suspend. 
-+ */ -+ -+struct nosave_region { -+ struct list_head list; -+ unsigned long start_pfn; -+ unsigned long end_pfn; -+}; -+ -+#ifndef PHYS_PFN_OFFSET -+#define PHYS_PFN_OFFSET 0 -+#endif -+ -+#define ZONE_START(thiszone) ((thiszone)->zone_start_pfn - PHYS_PFN_OFFSET) -+ -+#define BM_END_OF_MAP (~0UL) -+ -+#define BM_BITS_PER_BLOCK (PAGE_SIZE * BITS_PER_BYTE) -+ -+struct bm_block { -+ struct list_head hook; /* hook into a list of bitmap blocks */ -+ unsigned long start_pfn; /* pfn represented by the first bit */ -+ unsigned long end_pfn; /* pfn represented by the last bit plus 1 */ -+ unsigned long *data; /* bitmap representing pages */ -+}; -+ -+/* struct bm_position is used for browsing memory bitmaps */ -+ -+struct bm_position { -+ struct bm_block *block; -+ int bit; -+}; -+ -+struct memory_bitmap { -+ struct list_head blocks; /* list of bitmap blocks */ -+ struct linked_page *p_list; /* list of pages used to store zone -+ * bitmap objects and bitmap block -+ * objects -+ */ -+ struct bm_position cur; /* most recently used bit position */ -+ struct bm_position iter; /* most recently used bit position -+ * when iterating over a bitmap. 
-+ */ -+}; -+ -+extern int memory_bm_create(struct memory_bitmap *bm, gfp_t gfp_mask, -+ int safe_needed); -+extern void memory_bm_free(struct memory_bitmap *bm, int clear_nosave_free); -+extern void memory_bm_set_bit(struct memory_bitmap *bm, unsigned long pfn); -+extern void memory_bm_clear_bit(struct memory_bitmap *bm, unsigned long pfn); -+extern int memory_bm_test_bit(struct memory_bitmap *bm, unsigned long pfn); -+extern unsigned long memory_bm_next_pfn(struct memory_bitmap *bm); -+extern void memory_bm_position_reset(struct memory_bitmap *bm); -+extern void memory_bm_clear(struct memory_bitmap *bm); -+extern void memory_bm_copy(struct memory_bitmap *source, -+ struct memory_bitmap *dest); -+extern void memory_bm_dup(struct memory_bitmap *source, -+ struct memory_bitmap *dest); -+ -+#ifdef CONFIG_TOI -+struct toi_module_ops; -+extern int memory_bm_read(struct memory_bitmap *bm, int (*rw_chunk) -+ (int rw, struct toi_module_ops *owner, char *buffer, int buffer_size)); -+extern int memory_bm_write(struct memory_bitmap *bm, int (*rw_chunk) -+ (int rw, struct toi_module_ops *owner, char *buffer, int buffer_size)); -+#endif -diff --git a/kernel/power/process.c b/kernel/power/process.c -index 5ade1bd..e24a702 100644 ---- a/kernel/power/process.c -+++ b/kernel/power/process.c -@@ -15,6 +15,13 @@ - #include - #include - #include -+#include -+ -+int freezer_state; -+EXPORT_SYMBOL_GPL(freezer_state); -+ -+int freezer_sync; -+EXPORT_SYMBOL_GPL(freezer_sync); - - /* - * Timeout for stopping processes -@@ -93,7 +100,8 @@ static int try_to_freeze_tasks(bool sig_only) - do_each_thread(g, p) { - task_lock(p); - if (freezing(p) && !freezer_should_skip(p)) -- printk(KERN_ERR " %s\n", p->comm); -+ printk(KERN_ERR " %s (%d) failed to freeze.\n", -+ p->comm, p->pid); - cancel_freezing(p); - task_unlock(p); - } while_each_thread(g, p); -@@ -113,17 +121,26 @@ int freeze_processes(void) - { - int error; - -- printk("Freezing user space processes ... 
"); -+ printk(KERN_INFO "Stopping fuse filesystems.\n"); -+ freeze_filesystems(FS_FREEZER_FUSE); -+ freezer_state = FREEZER_FILESYSTEMS_FROZEN; -+ printk(KERN_INFO "Freezing user space processes ... "); - error = try_to_freeze_tasks(true); - if (error) - goto Exit; - printk("done.\n"); - -- printk("Freezing remaining freezable tasks ... "); -+ if (freezer_sync) -+ sys_sync(); -+ printk(KERN_INFO "Stopping normal filesystems.\n"); -+ freeze_filesystems(FS_FREEZER_NORMAL); -+ freezer_state = FREEZER_USERSPACE_FROZEN; -+ printk(KERN_INFO "Freezing remaining freezable tasks ... "); - error = try_to_freeze_tasks(false); - if (error) - goto Exit; - printk("done."); -+ freezer_state = FREEZER_FULLY_ON; - - oom_killer_disable(); - Exit: -@@ -132,6 +149,7 @@ int freeze_processes(void) - - return error; - } -+EXPORT_SYMBOL_GPL(freeze_processes); - - static void thaw_tasks(bool nosig_only) - { -@@ -155,12 +173,39 @@ static void thaw_tasks(bool nosig_only) - - void thaw_processes(void) - { -+ int old_state = freezer_state; -+ -+ if (old_state == FREEZER_OFF) -+ return; -+ -+ freezer_state = FREEZER_OFF; -+ - oom_killer_enable(); - -+ printk(KERN_INFO "Restarting all filesystems ...\n"); -+ thaw_filesystems(FS_FREEZER_ALL); -+ -+ printk(KERN_INFO "Restarting tasks ... "); -+ if (old_state == FREEZER_FULLY_ON) -+ thaw_tasks(true); -+ - printk("Restarting tasks ... "); -- thaw_tasks(true); - thaw_tasks(false); - schedule(); - printk("done.\n"); - } -+EXPORT_SYMBOL_GPL(thaw_processes); - -+void thaw_kernel_threads(void) -+{ -+ freezer_state = FREEZER_USERSPACE_FROZEN; -+ printk(KERN_INFO "Restarting normal filesystems.\n"); -+ thaw_filesystems(FS_FREEZER_NORMAL); -+ thaw_tasks(true); -+} -+ -+/* -+ * It's ugly putting this EXPORT down here, but it's necessary so that it -+ * doesn't matter whether the fs-freezing patch is applied or not. 
-+ */ -+EXPORT_SYMBOL_GPL(thaw_kernel_threads); -diff --git a/kernel/power/snapshot.c b/kernel/power/snapshot.c -index 36cb168..7f6da8f 100644 ---- a/kernel/power/snapshot.c -+++ b/kernel/power/snapshot.c -@@ -34,6 +34,8 @@ - #include - - #include "power.h" -+#include "tuxonice_builtin.h" -+#include "tuxonice_pagedir.h" - - static int swsusp_page_is_free(struct page *); - static void swsusp_set_page_forbidden(struct page *); -@@ -53,6 +55,10 @@ unsigned long image_size = 500 * 1024 * 1024; - * directly to their "original" page frames. - */ - struct pbe *restore_pblist; -+EXPORT_SYMBOL_GPL(restore_pblist); -+ -+int resume_attempted; -+EXPORT_SYMBOL_GPL(resume_attempted); - - /* Pointer to an auxiliary buffer (1 page) */ - static void *buffer; -@@ -95,6 +101,9 @@ static void *get_image_page(gfp_t gfp_mask, int safe_needed) - - unsigned long get_safe_page(gfp_t gfp_mask) - { -+ if (toi_running) -+ return toi_get_nonconflicting_page(); -+ - return (unsigned long)get_image_page(gfp_mask, PG_SAFE); - } - -@@ -231,47 +240,22 @@ static void *chain_alloc(struct chain_allocator *ca, unsigned int size) - * the represented memory area. 
- */ - --#define BM_END_OF_MAP (~0UL) -- --#define BM_BITS_PER_BLOCK (PAGE_SIZE * BITS_PER_BYTE) -- --struct bm_block { -- struct list_head hook; /* hook into a list of bitmap blocks */ -- unsigned long start_pfn; /* pfn represented by the first bit */ -- unsigned long end_pfn; /* pfn represented by the last bit plus 1 */ -- unsigned long *data; /* bitmap representing pages */ --}; -- - static inline unsigned long bm_block_bits(struct bm_block *bb) - { - return bb->end_pfn - bb->start_pfn; - } - --/* strcut bm_position is used for browsing memory bitmaps */ -- --struct bm_position { -- struct bm_block *block; -- int bit; --}; -- --struct memory_bitmap { -- struct list_head blocks; /* list of bitmap blocks */ -- struct linked_page *p_list; /* list of pages used to store zone -- * bitmap objects and bitmap block -- * objects -- */ -- struct bm_position cur; /* most recently used bit position */ --}; -- - /* Functions that operate on memory bitmaps */ - --static void memory_bm_position_reset(struct memory_bitmap *bm) -+void memory_bm_position_reset(struct memory_bitmap *bm) - { - bm->cur.block = list_entry(bm->blocks.next, struct bm_block, hook); - bm->cur.bit = 0; --} - --static void memory_bm_free(struct memory_bitmap *bm, int clear_nosave_free); -+ bm->iter.block = list_entry(bm->blocks.next, struct bm_block, hook); -+ bm->iter.bit = 0; -+} -+EXPORT_SYMBOL_GPL(memory_bm_position_reset); - - /** - * create_bm_block_list - create a list of block bitmap objects -@@ -379,7 +363,7 @@ static int create_mem_extents(struct list_head *list, gfp_t gfp_mask) - /** - * memory_bm_create - allocate memory for a memory bitmap - */ --static int -+int - memory_bm_create(struct memory_bitmap *bm, gfp_t gfp_mask, int safe_needed) - { - struct chain_allocator ca; -@@ -435,11 +419,12 @@ memory_bm_create(struct memory_bitmap *bm, gfp_t gfp_mask, int safe_needed) - memory_bm_free(bm, PG_UNSAFE_CLEAR); - goto Exit; - } -+EXPORT_SYMBOL_GPL(memory_bm_create); - - /** - * memory_bm_free - 
free memory occupied by the memory bitmap @bm - */ --static void memory_bm_free(struct memory_bitmap *bm, int clear_nosave_free) -+void memory_bm_free(struct memory_bitmap *bm, int clear_nosave_free) - { - struct bm_block *bb; - -@@ -451,6 +436,7 @@ static void memory_bm_free(struct memory_bitmap *bm, int clear_nosave_free) - - INIT_LIST_HEAD(&bm->blocks); - } -+EXPORT_SYMBOL_GPL(memory_bm_free); - - /** - * memory_bm_find_bit - find the bit in the bitmap @bm that corresponds -@@ -489,7 +475,7 @@ static int memory_bm_find_bit(struct memory_bitmap *bm, unsigned long pfn, - return 0; - } - --static void memory_bm_set_bit(struct memory_bitmap *bm, unsigned long pfn) -+void memory_bm_set_bit(struct memory_bitmap *bm, unsigned long pfn) - { - void *addr; - unsigned int bit; -@@ -499,6 +485,7 @@ static void memory_bm_set_bit(struct memory_bitmap *bm, unsigned long pfn) - BUG_ON(error); - set_bit(bit, addr); - } -+EXPORT_SYMBOL_GPL(memory_bm_set_bit); - - static int mem_bm_set_bit_check(struct memory_bitmap *bm, unsigned long pfn) - { -@@ -512,7 +499,7 @@ static int mem_bm_set_bit_check(struct memory_bitmap *bm, unsigned long pfn) - return error; - } - --static void memory_bm_clear_bit(struct memory_bitmap *bm, unsigned long pfn) -+void memory_bm_clear_bit(struct memory_bitmap *bm, unsigned long pfn) - { - void *addr; - unsigned int bit; -@@ -522,8 +509,9 @@ static void memory_bm_clear_bit(struct memory_bitmap *bm, unsigned long pfn) - BUG_ON(error); - clear_bit(bit, addr); - } -+EXPORT_SYMBOL_GPL(memory_bm_clear_bit); - --static int memory_bm_test_bit(struct memory_bitmap *bm, unsigned long pfn) -+int memory_bm_test_bit(struct memory_bitmap *bm, unsigned long pfn) - { - void *addr; - unsigned int bit; -@@ -533,6 +521,7 @@ static int memory_bm_test_bit(struct memory_bitmap *bm, unsigned long pfn) - BUG_ON(error); - return test_bit(bit, addr); - } -+EXPORT_SYMBOL_GPL(memory_bm_test_bit); - - static bool memory_bm_pfn_present(struct memory_bitmap *bm, unsigned long pfn) - { 
-@@ -551,43 +540,178 @@ static bool memory_bm_pfn_present(struct memory_bitmap *bm, unsigned long pfn) - * this function. - */ - --static unsigned long memory_bm_next_pfn(struct memory_bitmap *bm) -+unsigned long memory_bm_next_pfn(struct memory_bitmap *bm) - { - struct bm_block *bb; - int bit; - -- bb = bm->cur.block; -+ bb = bm->iter.block; - do { -- bit = bm->cur.bit; -+ bit = bm->iter.bit; - bit = find_next_bit(bb->data, bm_block_bits(bb), bit); - if (bit < bm_block_bits(bb)) - goto Return_pfn; - - bb = list_entry(bb->hook.next, struct bm_block, hook); -- bm->cur.block = bb; -- bm->cur.bit = 0; -+ bm->iter.block = bb; -+ bm->iter.bit = 0; - } while (&bb->hook != &bm->blocks); - - memory_bm_position_reset(bm); - return BM_END_OF_MAP; - - Return_pfn: -- bm->cur.bit = bit + 1; -+ bm->iter.bit = bit + 1; - return bb->start_pfn + bit; - } -+EXPORT_SYMBOL_GPL(memory_bm_next_pfn); - --/** -- * This structure represents a range of page frames the contents of which -- * should not be saved during the suspend. 
-- */ -+void memory_bm_clear(struct memory_bitmap *bm) -+{ -+ unsigned long pfn; - --struct nosave_region { -- struct list_head list; -- unsigned long start_pfn; -- unsigned long end_pfn; --}; -+ memory_bm_position_reset(bm); -+ pfn = memory_bm_next_pfn(bm); -+ while (pfn != BM_END_OF_MAP) { -+ memory_bm_clear_bit(bm, pfn); -+ pfn = memory_bm_next_pfn(bm); -+ } -+} -+EXPORT_SYMBOL_GPL(memory_bm_clear); -+ -+void memory_bm_copy(struct memory_bitmap *source, struct memory_bitmap *dest) -+{ -+ unsigned long pfn; -+ -+ memory_bm_position_reset(source); -+ pfn = memory_bm_next_pfn(source); -+ while (pfn != BM_END_OF_MAP) { -+ memory_bm_set_bit(dest, pfn); -+ pfn = memory_bm_next_pfn(source); -+ } -+} -+EXPORT_SYMBOL_GPL(memory_bm_copy); -+ -+void memory_bm_dup(struct memory_bitmap *source, struct memory_bitmap *dest) -+{ -+ memory_bm_clear(dest); -+ memory_bm_copy(source, dest); -+} -+EXPORT_SYMBOL_GPL(memory_bm_dup); -+ -+#ifdef CONFIG_TOI -+#define DEFINE_MEMORY_BITMAP(name) \ -+struct memory_bitmap *name; \ -+EXPORT_SYMBOL_GPL(name) -+ -+DEFINE_MEMORY_BITMAP(pageset1_map); -+DEFINE_MEMORY_BITMAP(pageset1_copy_map); -+DEFINE_MEMORY_BITMAP(pageset2_map); -+DEFINE_MEMORY_BITMAP(page_resave_map); -+DEFINE_MEMORY_BITMAP(io_map); -+DEFINE_MEMORY_BITMAP(nosave_map); -+DEFINE_MEMORY_BITMAP(free_map); -+ -+int memory_bm_write(struct memory_bitmap *bm, int (*rw_chunk) -+ (int rw, struct toi_module_ops *owner, char *buffer, int buffer_size)) -+{ -+ int result = 0; -+ unsigned int nr = 0; -+ struct bm_block *bb; -+ -+ if (!bm) -+ return result; - --static LIST_HEAD(nosave_regions); -+ list_for_each_entry(bb, &bm->blocks, hook) -+ nr++; -+ -+ result = (*rw_chunk)(WRITE, NULL, (char *) &nr, sizeof(unsigned int)); -+ if (result) -+ return result; -+ -+ list_for_each_entry(bb, &bm->blocks, hook) { -+ result = (*rw_chunk)(WRITE, NULL, (char *) &bb->start_pfn, -+ 2 * sizeof(unsigned long)); -+ if (result) -+ return result; -+ -+ result = (*rw_chunk)(WRITE, NULL, (char *) bb->data, 
PAGE_SIZE); -+ if (result) -+ return result; -+ } -+ -+ return 0; -+} -+EXPORT_SYMBOL_GPL(memory_bm_write); -+ -+int memory_bm_read(struct memory_bitmap *bm, int (*rw_chunk) -+ (int rw, struct toi_module_ops *owner, char *buffer, int buffer_size)) -+{ -+ int result = 0; -+ unsigned int nr, i; -+ struct bm_block *bb; -+ -+ if (!bm) -+ return result; -+ -+ result = memory_bm_create(bm, GFP_KERNEL, 0); -+ -+ if (result) -+ return result; -+ -+ result = (*rw_chunk)(READ, NULL, (char *) &nr, sizeof(unsigned int)); -+ if (result) -+ goto Free; -+ -+ for (i = 0; i < nr; i++) { -+ unsigned long pfn; -+ -+ result = (*rw_chunk)(READ, NULL, (char *) &pfn, -+ sizeof(unsigned long)); -+ if (result) -+ goto Free; -+ -+ list_for_each_entry(bb, &bm->blocks, hook) -+ if (bb->start_pfn == pfn) -+ break; -+ -+ if (&bb->hook == &bm->blocks) { -+ printk(KERN_ERR -+ "TuxOnIce: Failed to load memory bitmap.\n"); -+ result = -EINVAL; -+ goto Free; -+ } -+ -+ result = (*rw_chunk)(READ, NULL, (char *) &pfn, -+ sizeof(unsigned long)); -+ if (result) -+ goto Free; -+ -+ if (pfn != bb->end_pfn) { -+ printk(KERN_ERR -+ "TuxOnIce: Failed to load memory bitmap. " -+ "End PFN doesn't match what was saved.\n"); -+ result = -EINVAL; -+ goto Free; -+ } -+ -+ result = (*rw_chunk)(READ, NULL, (char *) bb->data, PAGE_SIZE); -+ -+ if (result) -+ goto Free; -+ } -+ -+ return 0; -+ -+Free: -+ memory_bm_free(bm, PG_ANY); -+ return result; -+} -+EXPORT_SYMBOL_GPL(memory_bm_read); -+#endif -+ -+LIST_HEAD(nosave_regions); -+EXPORT_SYMBOL_GPL(nosave_regions); - - /** - * register_nosave_region - register a range of page frames the contents -@@ -823,7 +947,7 @@ static unsigned int count_free_highmem_pages(void) - * We should save the page if it isn't Nosave or NosaveFree, or Reserved, - * and it isn't a part of a free chunk of pages. 
- */ --static struct page *saveable_highmem_page(struct zone *zone, unsigned long pfn) -+struct page *saveable_highmem_page(struct zone *zone, unsigned long pfn) - { - struct page *page; - -@@ -842,6 +966,7 @@ static struct page *saveable_highmem_page(struct zone *zone, unsigned long pfn) - - return page; - } -+EXPORT_SYMBOL_GPL(saveable_highmem_page); - - /** - * count_highmem_pages - compute the total number of saveable highmem -@@ -867,11 +992,6 @@ static unsigned int count_highmem_pages(void) - } - return n; - } --#else --static inline void *saveable_highmem_page(struct zone *z, unsigned long p) --{ -- return NULL; --} - #endif /* CONFIG_HIGHMEM */ - - /** -@@ -882,7 +1002,7 @@ static inline void *saveable_highmem_page(struct zone *z, unsigned long p) - * of pages statically defined as 'unsaveable', and it isn't a part of - * a free chunk of pages. - */ --static struct page *saveable_page(struct zone *zone, unsigned long pfn) -+struct page *saveable_page(struct zone *zone, unsigned long pfn) - { - struct page *page; - -@@ -904,6 +1024,7 @@ static struct page *saveable_page(struct zone *zone, unsigned long pfn) - - return page; - } -+EXPORT_SYMBOL_GPL(saveable_page); - - /** - * count_data_pages - compute the total number of saveable non-highmem -@@ -1500,6 +1621,9 @@ asmlinkage int swsusp_save(void) - { - unsigned int nr_pages, nr_highmem; - -+ if (toi_running) -+ return toi_post_context_save(); -+ - printk(KERN_INFO "PM: Creating hibernation image: \n"); - - drain_local_pages(NULL); -@@ -1540,14 +1664,14 @@ asmlinkage int swsusp_save(void) - } - - #ifndef CONFIG_ARCH_HIBERNATION_HEADER --static int init_header_complete(struct swsusp_info *info) -+int init_header_complete(struct swsusp_info *info) - { - memcpy(&info->uts, init_utsname(), sizeof(struct new_utsname)); - info->version_code = LINUX_VERSION_CODE; - return 0; - } - --static char *check_image_kernel(struct swsusp_info *info) -+char *check_image_kernel(struct swsusp_info *info) - { - if 
(info->version_code != LINUX_VERSION_CODE) - return "kernel version"; -@@ -1561,6 +1685,7 @@ static char *check_image_kernel(struct swsusp_info *info) - return "machine"; - return NULL; - } -+EXPORT_SYMBOL_GPL(check_image_kernel); - #endif /* CONFIG_ARCH_HIBERNATION_HEADER */ - - unsigned long snapshot_get_image_size(void) -@@ -1568,7 +1693,7 @@ unsigned long snapshot_get_image_size(void) - return nr_copy_pages + nr_meta_pages + 1; - } - --static int init_header(struct swsusp_info *info) -+int init_header(struct swsusp_info *info) - { - memset(info, 0, sizeof(struct swsusp_info)); - info->num_physpages = num_physpages; -@@ -1578,6 +1703,7 @@ static int init_header(struct swsusp_info *info) - info->size <<= PAGE_SHIFT; - return init_header_complete(info); - } -+EXPORT_SYMBOL_GPL(init_header); - - /** - * pack_pfns - pfns corresponding to the set bits found in the bitmap @bm -diff --git a/kernel/power/suspend.c b/kernel/power/suspend.c -index 6f10dfc..cecd9a8 100644 ---- a/kernel/power/suspend.c -+++ b/kernel/power/suspend.c -@@ -226,6 +226,7 @@ int suspend_devices_and_enter(suspend_state_t state) - suspend_ops->recover(); - goto Resume_devices; - } -+EXPORT_SYMBOL_GPL(suspend_devices_and_enter); - - /** - * suspend_finish - Do final work before exiting suspend sequence. -diff --git a/kernel/power/tuxonice.h b/kernel/power/tuxonice.h -new file mode 100644 -index 0000000..e7bc111 ---- /dev/null -+++ b/kernel/power/tuxonice.h -@@ -0,0 +1,211 @@ -+/* -+ * kernel/power/tuxonice.h -+ * -+ * Copyright (C) 2004-2010 Nigel Cunningham (nigel at tuxonice net) -+ * -+ * This file is released under the GPLv2. -+ * -+ * It contains declarations used throughout swsusp. 
-+ * -+ */ -+ -+#ifndef KERNEL_POWER_TOI_H -+#define KERNEL_POWER_TOI_H -+ -+#include -+#include -+#include -+#include -+#include -+#include -+#include "tuxonice_pageflags.h" -+#include "power.h" -+ -+#define TOI_CORE_VERSION "3.1" -+#define TOI_HEADER_VERSION 2 -+#define MY_BOOT_KERNEL_DATA_VERSION 3 -+ -+struct toi_boot_kernel_data { -+ int version; -+ int size; -+ unsigned long toi_action; -+ unsigned long toi_debug_state; -+ u32 toi_default_console_level; -+ int toi_io_time[2][2]; -+ char toi_nosave_commandline[COMMAND_LINE_SIZE]; -+ unsigned long pages_used[33]; -+ unsigned long compress_bytes_in; -+ unsigned long compress_bytes_out; -+}; -+ -+extern struct toi_boot_kernel_data toi_bkd; -+ -+/* Location of book kernel data struct in kernel being resumed */ -+extern unsigned long boot_kernel_data_buffer; -+ -+/* == Action states == */ -+ -+enum { -+ TOI_REBOOT, -+ TOI_PAUSE, -+ TOI_LOGALL, -+ TOI_CAN_CANCEL, -+ TOI_KEEP_IMAGE, -+ TOI_FREEZER_TEST, -+ TOI_SINGLESTEP, -+ TOI_PAUSE_NEAR_PAGESET_END, -+ TOI_TEST_FILTER_SPEED, -+ TOI_TEST_BIO, -+ TOI_NO_PAGESET2, -+ TOI_IGNORE_ROOTFS, -+ TOI_REPLACE_SWSUSP, -+ TOI_PAGESET2_FULL, -+ TOI_ABORT_ON_RESAVE_NEEDED, -+ TOI_NO_MULTITHREADED_IO, -+ TOI_NO_DIRECT_LOAD, /* Obsolete */ -+ TOI_LATE_CPU_HOTPLUG, -+ TOI_GET_MAX_MEM_ALLOCD, -+ TOI_NO_FLUSHER_THREAD, -+ TOI_NO_PS2_IF_UNNEEDED -+}; -+ -+#define clear_action_state(bit) (test_and_clear_bit(bit, &toi_bkd.toi_action)) -+ -+/* == Result states == */ -+ -+enum { -+ TOI_ABORTED, -+ TOI_ABORT_REQUESTED, -+ TOI_NOSTORAGE_AVAILABLE, -+ TOI_INSUFFICIENT_STORAGE, -+ TOI_FREEZING_FAILED, -+ TOI_KEPT_IMAGE, -+ TOI_WOULD_EAT_MEMORY, -+ TOI_UNABLE_TO_FREE_ENOUGH_MEMORY, -+ TOI_PM_SEM, -+ TOI_DEVICE_REFUSED, -+ TOI_SYSDEV_REFUSED, -+ TOI_EXTRA_PAGES_ALLOW_TOO_SMALL, -+ TOI_UNABLE_TO_PREPARE_IMAGE, -+ TOI_FAILED_MODULE_INIT, -+ TOI_FAILED_MODULE_CLEANUP, -+ TOI_FAILED_IO, -+ TOI_OUT_OF_MEMORY, -+ TOI_IMAGE_ERROR, -+ TOI_PLATFORM_PREP_FAILED, -+ TOI_CPU_HOTPLUG_FAILED, -+ 
TOI_ARCH_PREPARE_FAILED, -+ TOI_RESAVE_NEEDED, -+ TOI_CANT_SUSPEND, -+ TOI_NOTIFIERS_PREPARE_FAILED, -+ TOI_PRE_SNAPSHOT_FAILED, -+ TOI_PRE_RESTORE_FAILED, -+ TOI_USERMODE_HELPERS_ERR, -+ TOI_CANT_USE_ALT_RESUME, -+ TOI_HEADER_TOO_BIG, -+ TOI_NUM_RESULT_STATES /* Used in printing debug info only */ -+}; -+ -+extern unsigned long toi_result; -+ -+#define set_result_state(bit) (test_and_set_bit(bit, &toi_result)) -+#define set_abort_result(bit) (test_and_set_bit(TOI_ABORTED, &toi_result), \ -+ test_and_set_bit(bit, &toi_result)) -+#define clear_result_state(bit) (test_and_clear_bit(bit, &toi_result)) -+#define test_result_state(bit) (test_bit(bit, &toi_result)) -+ -+/* == Debug sections and levels == */ -+ -+/* debugging levels. */ -+enum { -+ TOI_STATUS = 0, -+ TOI_ERROR = 2, -+ TOI_LOW, -+ TOI_MEDIUM, -+ TOI_HIGH, -+ TOI_VERBOSE, -+}; -+ -+enum { -+ TOI_ANY_SECTION, -+ TOI_EAT_MEMORY, -+ TOI_IO, -+ TOI_HEADER, -+ TOI_WRITER, -+ TOI_MEMORY, -+}; -+ -+#define set_debug_state(bit) (test_and_set_bit(bit, &toi_bkd.toi_debug_state)) -+#define clear_debug_state(bit) \ -+ (test_and_clear_bit(bit, &toi_bkd.toi_debug_state)) -+#define test_debug_state(bit) (test_bit(bit, &toi_bkd.toi_debug_state)) -+ -+/* == Steps in hibernating == */ -+ -+enum { -+ STEP_HIBERNATE_PREPARE_IMAGE, -+ STEP_HIBERNATE_SAVE_IMAGE, -+ STEP_HIBERNATE_POWERDOWN, -+ STEP_RESUME_CAN_RESUME, -+ STEP_RESUME_LOAD_PS1, -+ STEP_RESUME_DO_RESTORE, -+ STEP_RESUME_READ_PS2, -+ STEP_RESUME_GO, -+ STEP_RESUME_ALT_IMAGE, -+ STEP_CLEANUP, -+ STEP_QUIET_CLEANUP -+}; -+ -+/* == TuxOnIce states == -+ (see also include/linux/suspend.h) */ -+ -+#define get_toi_state() (toi_state) -+#define restore_toi_state(saved_state) \ -+ do { toi_state = saved_state; } while (0) -+ -+/* == Module support == */ -+ -+struct toi_core_fns { -+ int (*post_context_save)(void); -+ unsigned long (*get_nonconflicting_page)(void); -+ int (*try_hibernate)(void); -+ void (*try_resume)(void); -+}; -+ -+extern struct toi_core_fns *toi_core_fns; 
-+ -+/* == All else == */ -+#define KB(x) ((x) << (PAGE_SHIFT - 10)) -+#define MB(x) ((x) >> (20 - PAGE_SHIFT)) -+ -+extern int toi_start_anything(int toi_or_resume); -+extern void toi_finish_anything(int toi_or_resume); -+ -+extern int save_image_part1(void); -+extern int toi_atomic_restore(void); -+ -+extern int toi_try_hibernate(void); -+extern void toi_try_resume(void); -+ -+extern int __toi_post_context_save(void); -+ -+extern unsigned int nr_hibernates; -+extern char alt_resume_param[256]; -+ -+extern void copyback_post(void); -+extern int toi_hibernate(void); -+extern unsigned long extra_pd1_pages_used; -+ -+#define SECTOR_SIZE 512 -+ -+extern void toi_early_boot_message(int can_erase_image, int default_answer, -+ char *warning_reason, ...); -+ -+extern int do_check_can_resume(void); -+extern int do_toi_step(int step); -+extern int toi_launch_userspace_program(char *command, int channel_no, -+ enum umh_wait wait, int debug); -+ -+extern char tuxonice_signature[9]; -+extern int freezer_sync; -+#endif -diff --git a/kernel/power/tuxonice_alloc.c b/kernel/power/tuxonice_alloc.c -new file mode 100644 -index 0000000..891c5b2 ---- /dev/null -+++ b/kernel/power/tuxonice_alloc.c -@@ -0,0 +1,313 @@ -+/* -+ * kernel/power/tuxonice_alloc.c -+ * -+ * Copyright (C) 2008-2010 Nigel Cunningham (nigel at tuxonice net) -+ * -+ * This file is released under the GPLv2. 
-+ * -+ */ -+ -+#ifdef CONFIG_PM_DEBUG -+#include -+#include -+#include "tuxonice_modules.h" -+#include "tuxonice_alloc.h" -+#include "tuxonice_sysfs.h" -+#include "tuxonice.h" -+ -+#define TOI_ALLOC_PATHS 40 -+ -+static DEFINE_MUTEX(toi_alloc_mutex); -+ -+static struct toi_module_ops toi_alloc_ops; -+ -+static int toi_fail_num; -+static int trace_allocs; -+static atomic_t toi_alloc_count[TOI_ALLOC_PATHS], -+ toi_free_count[TOI_ALLOC_PATHS], -+ toi_test_count[TOI_ALLOC_PATHS], -+ toi_fail_count[TOI_ALLOC_PATHS]; -+static int toi_cur_allocd[TOI_ALLOC_PATHS], toi_max_allocd[TOI_ALLOC_PATHS]; -+static int cur_allocd, max_allocd; -+ -+static char *toi_alloc_desc[TOI_ALLOC_PATHS] = { -+ "", /* 0 */ -+ "get_io_info_struct", -+ "extent", -+ "extent (loading chain)", -+ "userui channel", -+ "userui arg", /* 5 */ -+ "attention list metadata", -+ "extra pagedir memory metadata", -+ "bdev metadata", -+ "extra pagedir memory", -+ "header_locations_read", /* 10 */ -+ "bio queue", -+ "prepare_readahead", -+ "i/o buffer", -+ "writer buffer in bio_init", -+ "checksum buffer", /* 15 */ -+ "compression buffer", -+ "filewriter signature op", -+ "set resume param alloc1", -+ "set resume param alloc2", -+ "debugging info buffer", /* 20 */ -+ "check can resume buffer", -+ "write module config buffer", -+ "read module config buffer", -+ "write image header buffer", -+ "read pageset1 buffer", /* 25 */ -+ "get_have_image_data buffer", -+ "checksum page", -+ "worker rw loop", -+ "get nonconflicting page", -+ "ps1 load addresses", /* 30 */ -+ "remove swap image", -+ "swap image exists", -+ "swap parse sig location", -+ "sysfs kobj", -+ "swap mark resume attempted buffer", /* 35 */ -+ "cluster member", -+ "boot kernel data buffer", -+ "setting swap signature", -+ "block i/o bdev struct" -+}; -+ -+#define MIGHT_FAIL(FAIL_NUM, FAIL_VAL) \ -+ do { \ -+ BUG_ON(FAIL_NUM >= TOI_ALLOC_PATHS); \ -+ \ -+ if (FAIL_NUM == toi_fail_num) { \ -+ atomic_inc(&toi_test_count[FAIL_NUM]); \ -+ toi_fail_num = 0; 
\ -+ return FAIL_VAL; \ -+ } \ -+ } while (0) -+ -+static void alloc_update_stats(int fail_num, void *result, int size) -+{ -+ if (!result) { -+ atomic_inc(&toi_fail_count[fail_num]); -+ return; -+ } -+ -+ atomic_inc(&toi_alloc_count[fail_num]); -+ if (unlikely(test_action_state(TOI_GET_MAX_MEM_ALLOCD))) { -+ mutex_lock(&toi_alloc_mutex); -+ toi_cur_allocd[fail_num]++; -+ cur_allocd += size; -+ if (unlikely(cur_allocd > max_allocd)) { -+ int i; -+ -+ for (i = 0; i < TOI_ALLOC_PATHS; i++) -+ toi_max_allocd[i] = toi_cur_allocd[i]; -+ max_allocd = cur_allocd; -+ } -+ mutex_unlock(&toi_alloc_mutex); -+ } -+} -+ -+static void free_update_stats(int fail_num, int size) -+{ -+ BUG_ON(fail_num >= TOI_ALLOC_PATHS); -+ atomic_inc(&toi_free_count[fail_num]); -+ if (unlikely(atomic_read(&toi_free_count[fail_num]) > -+ atomic_read(&toi_alloc_count[fail_num]))) -+ dump_stack(); -+ if (unlikely(test_action_state(TOI_GET_MAX_MEM_ALLOCD))) { -+ mutex_lock(&toi_alloc_mutex); -+ cur_allocd -= size; -+ toi_cur_allocd[fail_num]--; -+ mutex_unlock(&toi_alloc_mutex); -+ } -+} -+ -+void *toi_kzalloc(int fail_num, size_t size, gfp_t flags) -+{ -+ void *result; -+ -+ if (toi_alloc_ops.enabled) -+ MIGHT_FAIL(fail_num, NULL); -+ result = kzalloc(size, flags); -+ if (toi_alloc_ops.enabled) -+ alloc_update_stats(fail_num, result, size); -+ if (fail_num == trace_allocs) -+ dump_stack(); -+ return result; -+} -+EXPORT_SYMBOL_GPL(toi_kzalloc); -+ -+unsigned long toi_get_free_pages(int fail_num, gfp_t mask, -+ unsigned int order) -+{ -+ unsigned long result; -+ -+ if (toi_alloc_ops.enabled) -+ MIGHT_FAIL(fail_num, 0); -+ result = __get_free_pages(mask, order); -+ if (toi_alloc_ops.enabled) -+ alloc_update_stats(fail_num, (void *) result, -+ PAGE_SIZE << order); -+ if (fail_num == trace_allocs) -+ dump_stack(); -+ return result; -+} -+EXPORT_SYMBOL_GPL(toi_get_free_pages); -+ -+struct page *toi_alloc_page(int fail_num, gfp_t mask) -+{ -+ struct page *result; -+ -+ if (toi_alloc_ops.enabled) -+ 
MIGHT_FAIL(fail_num, NULL); -+ result = alloc_page(mask); -+ if (toi_alloc_ops.enabled) -+ alloc_update_stats(fail_num, (void *) result, PAGE_SIZE); -+ if (fail_num == trace_allocs) -+ dump_stack(); -+ return result; -+} -+EXPORT_SYMBOL_GPL(toi_alloc_page); -+ -+unsigned long toi_get_zeroed_page(int fail_num, gfp_t mask) -+{ -+ unsigned long result; -+ -+ if (fail_num == trace_allocs) -+ dump_stack(); -+ if (toi_alloc_ops.enabled) -+ MIGHT_FAIL(fail_num, 0); -+ result = get_zeroed_page(mask); -+ if (toi_alloc_ops.enabled) -+ alloc_update_stats(fail_num, (void *) result, PAGE_SIZE); -+ if (fail_num == trace_allocs) -+ dump_stack(); -+ return result; -+} -+EXPORT_SYMBOL_GPL(toi_get_zeroed_page); -+ -+void toi_kfree(int fail_num, const void *arg, int size) -+{ -+ if (arg && toi_alloc_ops.enabled) -+ free_update_stats(fail_num, size); -+ -+ if (fail_num == trace_allocs) -+ dump_stack(); -+ kfree(arg); -+} -+EXPORT_SYMBOL_GPL(toi_kfree); -+ -+void toi_free_page(int fail_num, unsigned long virt) -+{ -+ if (virt && toi_alloc_ops.enabled) -+ free_update_stats(fail_num, PAGE_SIZE); -+ -+ if (fail_num == trace_allocs) -+ dump_stack(); -+ free_page(virt); -+} -+EXPORT_SYMBOL_GPL(toi_free_page); -+ -+void toi__free_page(int fail_num, struct page *page) -+{ -+ if (page && toi_alloc_ops.enabled) -+ free_update_stats(fail_num, PAGE_SIZE); -+ -+ if (fail_num == trace_allocs) -+ dump_stack(); -+ __free_page(page); -+} -+EXPORT_SYMBOL_GPL(toi__free_page); -+ -+void toi_free_pages(int fail_num, struct page *page, int order) -+{ -+ if (page && toi_alloc_ops.enabled) -+ free_update_stats(fail_num, PAGE_SIZE << order); -+ -+ if (fail_num == trace_allocs) -+ dump_stack(); -+ __free_pages(page, order); -+} -+ -+void toi_alloc_print_debug_stats(void) -+{ -+ int i, header_done = 0; -+ -+ if (!toi_alloc_ops.enabled) -+ return; -+ -+ for (i = 0; i < TOI_ALLOC_PATHS; i++) -+ if (atomic_read(&toi_alloc_count[i]) != -+ atomic_read(&toi_free_count[i])) { -+ if (!header_done) { -+ printk(KERN_INFO 
"Idx Allocs Frees Tests " -+ " Fails Max Description\n"); -+ header_done = 1; -+ } -+ -+ printk(KERN_INFO "%3d %7d %7d %7d %7d %7d %s\n", i, -+ atomic_read(&toi_alloc_count[i]), -+ atomic_read(&toi_free_count[i]), -+ atomic_read(&toi_test_count[i]), -+ atomic_read(&toi_fail_count[i]), -+ toi_max_allocd[i], -+ toi_alloc_desc[i]); -+ } -+} -+EXPORT_SYMBOL_GPL(toi_alloc_print_debug_stats); -+ -+static int toi_alloc_initialise(int starting_cycle) -+{ -+ int i; -+ -+ if (!starting_cycle) -+ return 0; -+ -+ for (i = 0; i < TOI_ALLOC_PATHS; i++) { -+ atomic_set(&toi_alloc_count[i], 0); -+ atomic_set(&toi_free_count[i], 0); -+ atomic_set(&toi_test_count[i], 0); -+ atomic_set(&toi_fail_count[i], 0); -+ toi_cur_allocd[i] = 0; -+ toi_max_allocd[i] = 0; -+ }; -+ -+ max_allocd = 0; -+ cur_allocd = 0; -+ return 0; -+} -+ -+static struct toi_sysfs_data sysfs_params[] = { -+ SYSFS_INT("failure_test", SYSFS_RW, &toi_fail_num, 0, 99, 0, NULL), -+ SYSFS_INT("trace", SYSFS_RW, &trace_allocs, 0, TOI_ALLOC_PATHS, 0, -+ NULL), -+ SYSFS_BIT("find_max_mem_allocated", SYSFS_RW, &toi_bkd.toi_action, -+ TOI_GET_MAX_MEM_ALLOCD, 0), -+ SYSFS_INT("enabled", SYSFS_RW, &toi_alloc_ops.enabled, 0, 1, 0, -+ NULL) -+}; -+ -+static struct toi_module_ops toi_alloc_ops = { -+ .type = MISC_HIDDEN_MODULE, -+ .name = "allocation debugging", -+ .directory = "alloc", -+ .module = THIS_MODULE, -+ .early = 1, -+ .initialise = toi_alloc_initialise, -+ -+ .sysfs_data = sysfs_params, -+ .num_sysfs_entries = sizeof(sysfs_params) / -+ sizeof(struct toi_sysfs_data), -+}; -+ -+int toi_alloc_init(void) -+{ -+ int result = toi_register_module(&toi_alloc_ops); -+ return result; -+} -+ -+void toi_alloc_exit(void) -+{ -+ toi_unregister_module(&toi_alloc_ops); -+} -+#endif -diff --git a/kernel/power/tuxonice_alloc.h b/kernel/power/tuxonice_alloc.h -new file mode 100644 -index 0000000..6cd19ba ---- /dev/null -+++ b/kernel/power/tuxonice_alloc.h -@@ -0,0 +1,51 @@ -+/* -+ * kernel/power/tuxonice_alloc.h -+ * -+ * Copyright (C) 
2008-2010 Nigel Cunningham (nigel at tuxonice net) -+ * -+ * This file is released under the GPLv2. -+ * -+ */ -+ -+#define TOI_WAIT_GFP (GFP_NOFS | __GFP_NOWARN) -+#define TOI_ATOMIC_GFP (GFP_ATOMIC | __GFP_NOWARN) -+ -+#ifdef CONFIG_PM_DEBUG -+extern void *toi_kzalloc(int fail_num, size_t size, gfp_t flags); -+extern void toi_kfree(int fail_num, const void *arg, int size); -+ -+extern unsigned long toi_get_free_pages(int fail_num, gfp_t mask, -+ unsigned int order); -+#define toi_get_free_page(FAIL_NUM, MASK) toi_get_free_pages(FAIL_NUM, MASK, 0) -+extern unsigned long toi_get_zeroed_page(int fail_num, gfp_t mask); -+extern void toi_free_page(int fail_num, unsigned long buf); -+extern void toi__free_page(int fail_num, struct page *page); -+extern void toi_free_pages(int fail_num, struct page *page, int order); -+extern struct page *toi_alloc_page(int fail_num, gfp_t mask); -+extern int toi_alloc_init(void); -+extern void toi_alloc_exit(void); -+ -+extern void toi_alloc_print_debug_stats(void); -+ -+#else /* CONFIG_PM_DEBUG */ -+ -+#define toi_kzalloc(FAIL, SIZE, FLAGS) (kzalloc(SIZE, FLAGS)) -+#define toi_kfree(FAIL, ALLOCN, SIZE) (kfree(ALLOCN)) -+ -+#define toi_get_free_pages(FAIL, FLAGS, ORDER) __get_free_pages(FLAGS, ORDER) -+#define toi_get_free_page(FAIL, FLAGS) __get_free_page(FLAGS) -+#define toi_get_zeroed_page(FAIL, FLAGS) get_zeroed_page(FLAGS) -+#define toi_free_page(FAIL, ALLOCN) do { free_page(ALLOCN); } while (0) -+#define toi__free_page(FAIL, PAGE) __free_page(PAGE) -+#define toi_free_pages(FAIL, PAGE, ORDER) __free_pages(PAGE, ORDER) -+#define toi_alloc_page(FAIL, MASK) alloc_page(MASK) -+static inline int toi_alloc_init(void) -+{ -+ return 0; -+} -+ -+static inline void toi_alloc_exit(void) { } -+ -+static inline void toi_alloc_print_debug_stats(void) { } -+ -+#endif -diff --git a/kernel/power/tuxonice_atomic_copy.c b/kernel/power/tuxonice_atomic_copy.c -new file mode 100644 -index 0000000..1807f8b ---- /dev/null -+++ 
b/kernel/power/tuxonice_atomic_copy.c -@@ -0,0 +1,418 @@ -+/* -+ * kernel/power/tuxonice_atomic_copy.c -+ * -+ * Copyright 2004-2010 Nigel Cunningham (nigel at tuxonice net) -+ * -+ * Distributed under GPLv2. -+ * -+ * Routines for doing the atomic save/restore. -+ */ -+ -+#include -+#include -+#include -+#include -+#include -+#include -+#include "tuxonice.h" -+#include "tuxonice_storage.h" -+#include "tuxonice_power_off.h" -+#include "tuxonice_ui.h" -+#include "tuxonice_io.h" -+#include "tuxonice_prepare_image.h" -+#include "tuxonice_pageflags.h" -+#include "tuxonice_checksum.h" -+#include "tuxonice_builtin.h" -+#include "tuxonice_atomic_copy.h" -+#include "tuxonice_alloc.h" -+#include "tuxonice_modules.h" -+ -+unsigned long extra_pd1_pages_used; -+ -+/** -+ * free_pbe_list - free page backup entries used by the atomic copy code. -+ * @list: List to free. -+ * @highmem: Whether the list is in highmem. -+ * -+ * Normally, this function isn't used. If, however, we need to abort before -+ * doing the atomic copy, we use this to free the pbes previously allocated. 
-+ **/ -+static void free_pbe_list(struct pbe **list, int highmem) -+{ -+ while (*list) { -+ int i; -+ struct pbe *free_pbe, *next_page = NULL; -+ struct page *page; -+ -+ if (highmem) { -+ page = (struct page *) *list; -+ free_pbe = (struct pbe *) kmap(page); -+ } else { -+ page = virt_to_page(*list); -+ free_pbe = *list; -+ } -+ -+ for (i = 0; i < PBES_PER_PAGE; i++) { -+ if (!free_pbe) -+ break; -+ if (highmem) -+ toi__free_page(29, free_pbe->address); -+ else -+ toi_free_page(29, -+ (unsigned long) free_pbe->address); -+ free_pbe = free_pbe->next; -+ } -+ -+ if (highmem) { -+ if (free_pbe) -+ next_page = free_pbe; -+ kunmap(page); -+ } else { -+ if (free_pbe) -+ next_page = free_pbe; -+ } -+ -+ toi__free_page(29, page); -+ *list = (struct pbe *) next_page; -+ }; -+} -+ -+/** -+ * copyback_post - post atomic-restore actions -+ * -+ * After doing the atomic restore, we have a few more things to do: -+ * 1) We want to retain some values across the restore, so we now copy -+ * these from the nosave variables to the normal ones. -+ * 2) Set the status flags. -+ * 3) Resume devices. -+ * 4) Tell userui so it can redraw & restore settings. -+ * 5) Reread the page cache. -+ **/ -+void copyback_post(void) -+{ -+ struct toi_boot_kernel_data *bkd = -+ (struct toi_boot_kernel_data *) boot_kernel_data_buffer; -+ -+ /* -+ * The boot kernel's data may be larger (newer version) or -+ * smaller (older version) than ours. Copy the minimum -+ * of the two sizes, so that we don't overwrite valid values -+ * from pre-atomic copy. 
-+ */ -+ -+ memcpy(&toi_bkd, (char *) boot_kernel_data_buffer, -+ min_t(int, sizeof(struct toi_boot_kernel_data), -+ bkd->size)); -+ -+ if (toi_activate_storage(1)) -+ panic("Failed to reactivate our storage."); -+ -+ toi_post_atomic_restore_modules(bkd); -+ -+ toi_cond_pause(1, "About to reload secondary pagedir."); -+ -+ if (read_pageset2(0)) -+ panic("Unable to successfully reread the page cache."); -+ -+ /* -+ * If the user wants to sleep again after resuming from full-off, -+ * it's most likely to be in order to suspend to ram, so we'll -+ * do this check after loading pageset2, to give them the fastest -+ * wakeup when they are ready to use the computer again. -+ */ -+ toi_check_resleep(); -+} -+ -+/** -+ * toi_copy_pageset1 - do the atomic copy of pageset1 -+ * -+ * Make the atomic copy of pageset1. We can't use copy_page (as we once did) -+ * because we can't be sure what side effects it has. On my old Duron, with -+ * 3DNOW, kernel_fpu_begin increments preempt count, making our preempt -+ * count at resume time 4 instead of 3. -+ * -+ * We don't want to call kmap_atomic unconditionally because it has the side -+ * effect of incrementing the preempt count, which will leave it one too high -+ * post resume (the page containing the preempt count will be copied after -+ * its incremented. This is essentially the same problem. -+ **/ -+void toi_copy_pageset1(void) -+{ -+ int i; -+ unsigned long source_index, dest_index; -+ -+ memory_bm_position_reset(pageset1_map); -+ memory_bm_position_reset(pageset1_copy_map); -+ -+ source_index = memory_bm_next_pfn(pageset1_map); -+ dest_index = memory_bm_next_pfn(pageset1_copy_map); -+ -+ for (i = 0; i < pagedir1.size; i++) { -+ unsigned long *origvirt, *copyvirt; -+ struct page *origpage, *copypage; -+ int loop = (PAGE_SIZE / sizeof(unsigned long)) - 1, -+ was_present1, was_present2; -+ -+ origpage = pfn_to_page(source_index); -+ copypage = pfn_to_page(dest_index); -+ -+ origvirt = PageHighMem(origpage) ? 
-+ kmap_atomic(origpage, KM_USER0) : -+ page_address(origpage); -+ -+ copyvirt = PageHighMem(copypage) ? -+ kmap_atomic(copypage, KM_USER1) : -+ page_address(copypage); -+ -+ was_present1 = kernel_page_present(origpage); -+ if (!was_present1) -+ kernel_map_pages(origpage, 1, 1); -+ -+ was_present2 = kernel_page_present(copypage); -+ if (!was_present2) -+ kernel_map_pages(copypage, 1, 1); -+ -+ while (loop >= 0) { -+ *(copyvirt + loop) = *(origvirt + loop); -+ loop--; -+ } -+ -+ if (!was_present1) -+ kernel_map_pages(origpage, 1, 0); -+ -+ if (!was_present2) -+ kernel_map_pages(copypage, 1, 0); -+ -+ if (PageHighMem(origpage)) -+ kunmap_atomic(origvirt, KM_USER0); -+ -+ if (PageHighMem(copypage)) -+ kunmap_atomic(copyvirt, KM_USER1); -+ -+ source_index = memory_bm_next_pfn(pageset1_map); -+ dest_index = memory_bm_next_pfn(pageset1_copy_map); -+ } -+} -+ -+/** -+ * __toi_post_context_save - steps after saving the cpu context -+ * -+ * Steps taken after saving the CPU state to make the actual -+ * atomic copy. -+ * -+ * Called from swsusp_save in snapshot.c via toi_post_context_save. -+ **/ -+int __toi_post_context_save(void) -+{ -+ unsigned long old_ps1_size = pagedir1.size; -+ -+ check_checksums(); -+ -+ free_checksum_pages(); -+ -+ toi_recalculate_image_contents(1); -+ -+ extra_pd1_pages_used = pagedir1.size > old_ps1_size ? -+ pagedir1.size - old_ps1_size : 0; -+ -+ if (extra_pd1_pages_used > extra_pd1_pages_allowance) { -+ printk(KERN_INFO "Pageset1 has grown by %lu pages. " -+ "extra_pages_allowance is currently only %lu.\n", -+ pagedir1.size - old_ps1_size, -+ extra_pd1_pages_allowance); -+ -+ /* -+ * Highlevel code will see this, clear the state and -+ * retry if we haven't already done so twice. 
-+ */ -+ set_abort_result(TOI_EXTRA_PAGES_ALLOW_TOO_SMALL); -+ return 1; -+ } -+ -+ if (!test_action_state(TOI_TEST_FILTER_SPEED) && -+ !test_action_state(TOI_TEST_BIO)) -+ toi_copy_pageset1(); -+ -+ return 0; -+} -+ -+/** -+ * toi_hibernate - high level code for doing the atomic copy -+ * -+ * High-level code which prepares to do the atomic copy. Loosely based -+ * on the swsusp version, but with the following twists: -+ * - We set toi_running so the swsusp code uses our code paths. -+ * - We give better feedback regarding what goes wrong if there is a -+ * problem. -+ * - We use an extra function to call the assembly, just in case this code -+ * is in a module (return address). -+ **/ -+int toi_hibernate(void) -+{ -+ int error; -+ -+ toi_running = 1; /* For the swsusp code we use :< */ -+ -+ error = toi_lowlevel_builtin(); -+ -+ toi_running = 0; -+ return error; -+} -+ -+/** -+ * toi_atomic_restore - prepare to do the atomic restore -+ * -+ * Get ready to do the atomic restore. This part gets us into the same -+ * state we are in prior to do calling do_toi_lowlevel while -+ * hibernating: hot-unplugging secondary cpus and freeze processes, -+ * before starting the thread that will do the restore. -+ **/ -+int toi_atomic_restore(void) -+{ -+ int error; -+ -+ toi_running = 1; -+ -+ toi_prepare_status(DONT_CLEAR_BAR, "Atomic restore."); -+ -+ memcpy(&toi_bkd.toi_nosave_commandline, saved_command_line, -+ strlen(saved_command_line)); -+ -+ toi_pre_atomic_restore_modules(&toi_bkd); -+ -+ if (add_boot_kernel_data_pbe()) -+ goto Failed; -+ -+ toi_prepare_status(DONT_CLEAR_BAR, "Doing atomic copy/restore."); -+ -+ if (toi_go_atomic(PMSG_QUIESCE, 0)) -+ goto Failed; -+ -+ /* We'll ignore saved state, but this gets preempt count (etc) right */ -+ save_processor_state(); -+ -+ error = swsusp_arch_resume(); -+ /* -+ * Code below is only ever reached in case of failure. Otherwise -+ * execution continues at place where swsusp_arch_suspend was called. 
-+ * -+ * We don't know whether it's safe to continue (this shouldn't happen), -+ * so lets err on the side of caution. -+ */ -+ BUG(); -+ -+Failed: -+ free_pbe_list(&restore_pblist, 0); -+#ifdef CONFIG_HIGHMEM -+ free_pbe_list(&restore_highmem_pblist, 1); -+#endif -+ toi_running = 0; -+ return 1; -+} -+ -+/** -+ * toi_go_atomic - do the actual atomic copy/restore -+ * @state: The state to use for dpm_suspend_start & power_down calls. -+ * @suspend_time: Whether we're suspending or resuming. -+ **/ -+int toi_go_atomic(pm_message_t state, int suspend_time) -+{ -+ if (suspend_time && platform_begin(1)) { -+ set_abort_result(TOI_PLATFORM_PREP_FAILED); -+ return 1; -+ } -+ -+ suspend_console(); -+ -+ if (dpm_suspend_start(state)) { -+ set_abort_result(TOI_DEVICE_REFUSED); -+ toi_end_atomic(ATOMIC_STEP_DEVICE_RESUME, suspend_time, 3); -+ return 1; -+ } -+ -+ if (suspend_time && arch_prepare_suspend()) { -+ set_abort_result(TOI_ARCH_PREPARE_FAILED); -+ toi_end_atomic(ATOMIC_STEP_DEVICE_RESUME, suspend_time, 1); -+ return 1; -+ } -+ -+ /* At this point, dpm_suspend_start() has been called, but *not* -+ * dpm_suspend_noirq(). We *must* dpm_suspend_noirq() now. -+ * Otherwise, drivers for some devices (e.g. interrupt controllers) -+ * become desynchronized with the actual state of the hardware -+ * at resume time, and evil weirdness ensues. 
-+ */ -+ -+ if (dpm_suspend_noirq(state)) { -+ set_abort_result(TOI_DEVICE_REFUSED); -+ toi_end_atomic(ATOMIC_STEP_DEVICE_RESUME, suspend_time, 1); -+ return 1; -+ } -+ -+ if (suspend_time && platform_pre_snapshot(1)) { -+ set_abort_result(TOI_PRE_SNAPSHOT_FAILED); -+ toi_end_atomic(ATOMIC_STEP_PLATFORM_FINISH, suspend_time, 1); -+ return 1; -+ } -+ -+ if (!suspend_time && platform_pre_restore(1)) { -+ set_abort_result(TOI_PRE_RESTORE_FAILED); -+ toi_end_atomic(ATOMIC_STEP_PLATFORM_FINISH, suspend_time, 1); -+ return 1; -+ } -+ -+ if (test_action_state(TOI_LATE_CPU_HOTPLUG)) { -+ if (disable_nonboot_cpus()) { -+ set_abort_result(TOI_CPU_HOTPLUG_FAILED); -+ toi_end_atomic(ATOMIC_STEP_CPU_HOTPLUG, -+ suspend_time, 1); -+ return 1; -+ } -+ } -+ -+ local_irq_disable(); -+ -+ if (sysdev_suspend(state)) { -+ set_abort_result(TOI_SYSDEV_REFUSED); -+ toi_end_atomic(ATOMIC_STEP_IRQS, suspend_time, 1); -+ return 1; -+ } -+ -+ return 0; -+} -+ -+/** -+ * toi_end_atomic - post atomic copy/restore routines -+ * @stage: What step to start at. -+ * @suspend_time: Whether we're suspending or resuming. -+ * @error: Whether we're recovering from an error. -+ **/ -+void toi_end_atomic(int stage, int suspend_time, int error) -+{ -+ switch (stage) { -+ case ATOMIC_ALL_STEPS: -+ if (!suspend_time) -+ platform_leave(1); -+ sysdev_resume(); -+ case ATOMIC_STEP_IRQS: -+ local_irq_enable(); -+ case ATOMIC_STEP_CPU_HOTPLUG: -+ if (test_action_state(TOI_LATE_CPU_HOTPLUG)) -+ enable_nonboot_cpus(); -+ platform_restore_cleanup(1); -+ case ATOMIC_STEP_PLATFORM_FINISH: -+ platform_finish(1); -+ dpm_resume_noirq(suspend_time ? -+ (error ? PMSG_RECOVER : PMSG_THAW) : PMSG_RESTORE); -+ case ATOMIC_STEP_DEVICE_RESUME: -+ if (suspend_time && (error & 2)) -+ platform_recover(1); -+ dpm_resume_end(suspend_time ? -+ ((error & 1) ? 
PMSG_RECOVER : PMSG_THAW) : -+ PMSG_RESTORE); -+ resume_console(); -+ platform_end(1); -+ -+ toi_prepare_status(DONT_CLEAR_BAR, "Post atomic."); -+ } -+} -diff --git a/kernel/power/tuxonice_atomic_copy.h b/kernel/power/tuxonice_atomic_copy.h -new file mode 100644 -index 0000000..e61b27b ---- /dev/null -+++ b/kernel/power/tuxonice_atomic_copy.h -@@ -0,0 +1,20 @@ -+/* -+ * kernel/power/tuxonice_atomic_copy.h -+ * -+ * Copyright 2008-2010 Nigel Cunningham (nigel at tuxonice net) -+ * -+ * Distributed under GPLv2. -+ * -+ * Routines for doing the atomic save/restore. -+ */ -+ -+enum { -+ ATOMIC_ALL_STEPS, -+ ATOMIC_STEP_IRQS, -+ ATOMIC_STEP_CPU_HOTPLUG, -+ ATOMIC_STEP_PLATFORM_FINISH, -+ ATOMIC_STEP_DEVICE_RESUME, -+}; -+ -+int toi_go_atomic(pm_message_t state, int toi_time); -+void toi_end_atomic(int stage, int toi_time, int error); -diff --git a/kernel/power/tuxonice_bio.h b/kernel/power/tuxonice_bio.h -new file mode 100644 -index 0000000..9627ccc ---- /dev/null -+++ b/kernel/power/tuxonice_bio.h -@@ -0,0 +1,77 @@ -+/* -+ * kernel/power/tuxonice_bio.h -+ * -+ * Copyright (C) 2004-2010 Nigel Cunningham (nigel at tuxonice net) -+ * -+ * Distributed under GPLv2. -+ * -+ * This file contains declarations for functions exported from -+ * tuxonice_bio.c, which contains low level io functions. 
-+ */ -+ -+#include -+#include "tuxonice_extent.h" -+ -+void toi_put_extent_chain(struct hibernate_extent_chain *chain); -+int toi_add_to_extent_chain(struct hibernate_extent_chain *chain, -+ unsigned long start, unsigned long end); -+ -+struct hibernate_extent_saved_state { -+ int extent_num; -+ struct hibernate_extent *extent_ptr; -+ unsigned long offset; -+}; -+ -+struct toi_bdev_info { -+ struct toi_bdev_info *next; -+ struct hibernate_extent_chain blocks; -+ struct block_device *bdev; -+ struct toi_module_ops *allocator; -+ int allocator_index; -+ struct hibernate_extent_chain allocations; -+ char name[266]; /* "swap on " or "file " + up to 256 chars */ -+ -+ /* Saved in header */ -+ char uuid[17]; -+ dev_t dev_t; -+ int prio; -+ int bmap_shift; -+ int blocks_per_page; -+ unsigned long pages_used; -+ struct hibernate_extent_saved_state saved_state[4]; -+}; -+ -+struct toi_extent_iterate_state { -+ struct toi_bdev_info *current_chain; -+ int num_chains; -+ int saved_chain_number[4]; -+ struct toi_bdev_info *saved_chain_ptr[4]; -+}; -+ -+/* -+ * Our exported interface so the swapwriter and filewriter don't -+ * need these functions duplicated. 
-+ */ -+struct toi_bio_ops { -+ int (*bdev_page_io) (int rw, struct block_device *bdev, long pos, -+ struct page *page); -+ int (*register_storage)(struct toi_bdev_info *new); -+ void (*free_storage)(void); -+}; -+ -+struct toi_allocator_ops { -+ unsigned long (*toi_swap_storage_available) (void); -+}; -+ -+extern struct toi_bio_ops toi_bio_ops; -+ -+extern char *toi_writer_buffer; -+extern int toi_writer_buffer_posn; -+ -+struct toi_bio_allocator_ops { -+ int (*register_storage) (void); -+ unsigned long (*storage_available)(void); -+ int (*allocate_storage) (struct toi_bdev_info *, unsigned long); -+ int (*bmap) (struct toi_bdev_info *); -+ void (*free_storage) (struct toi_bdev_info *); -+}; -diff --git a/kernel/power/tuxonice_bio_chains.c b/kernel/power/tuxonice_bio_chains.c -new file mode 100644 -index 0000000..2ac2042 ---- /dev/null -+++ b/kernel/power/tuxonice_bio_chains.c -@@ -0,0 +1,1044 @@ -+/* -+ * kernel/power/tuxonice_bio_devinfo.c -+ * -+ * Copyright (C) 2009-2010 Nigel Cunningham (nigel at tuxonice net) -+ * -+ * Distributed under GPLv2. -+ * -+ */ -+ -+#include -+#include "tuxonice_bio.h" -+#include "tuxonice_bio_internal.h" -+#include "tuxonice_alloc.h" -+#include "tuxonice_ui.h" -+#include "tuxonice.h" -+#include "tuxonice_io.h" -+ -+static struct toi_bdev_info *prio_chain_head; -+static int num_chains; -+ -+/* Pointer to current entry being loaded/saved. */ -+struct toi_extent_iterate_state toi_writer_posn; -+ -+#define metadata_size (sizeof(struct toi_bdev_info) - \ -+ offsetof(struct toi_bdev_info, uuid)) -+ -+/* -+ * After section 0 (header) comes 2 => next_section[0] = 2 -+ */ -+static int next_section[3] = { 2, 3, 1 }; -+ -+/** -+ * dump_block_chains - print the contents of the bdev info array. 
-+ **/
-+void dump_block_chains(void)
-+{
-+	int i = 0;
-+	int j;
-+	struct toi_bdev_info *cur_chain = prio_chain_head;
-+
-+	while (cur_chain) {
-+		struct hibernate_extent *this = cur_chain->blocks.first;
-+
-+		printk(KERN_DEBUG "Chain %d (prio %d):", i, cur_chain->prio);
-+
-+		while (this) {
-+			printk(KERN_CONT " [%lu-%lu]%s", this->start,
-+					this->end, this->next ? "," : "");
-+			this = this->next;
-+		}
-+
-+		printk(KERN_CONT "\n");
-+		cur_chain = cur_chain->next;
-+		i++;
-+	}
-+
-+	printk(KERN_DEBUG "Saved states:\n");
-+	for (i = 0; i < 4; i++) {
-+		printk(KERN_DEBUG "Slot %d: Chain %d.\n",
-+				i, toi_writer_posn.saved_chain_number[i]);
-+
-+		cur_chain = prio_chain_head;
-+		j = 0;
-+		while (cur_chain) {
-+			printk(KERN_DEBUG " Chain %d: Extent %d. Offset %lu.\n",
-+					j, cur_chain->saved_state[i].extent_num,
-+					cur_chain->saved_state[i].offset);
-+			cur_chain = cur_chain->next;
-+			j++;
-+		}
-+		printk(KERN_CONT "\n");
-+	}
-+}
-+
-+/**
-+ * toi_extent_chain_next - advance the current chain's position one block,
-+ * moving to the next extent when the current one is exhausted.
-+ **/
-+static void toi_extent_chain_next(void)
-+{
-+	struct toi_bdev_info *this = toi_writer_posn.current_chain;
-+
-+	if (!this->blocks.current_extent)
-+		return;
-+
-+	if (this->blocks.current_offset == this->blocks.current_extent->end) {
-+		if (this->blocks.current_extent->next) {
-+			this->blocks.current_extent =
-+				this->blocks.current_extent->next;
-+			this->blocks.current_offset =
-+				this->blocks.current_extent->start;
-+		} else {
-+			this->blocks.current_extent = NULL;
-+			this->blocks.current_offset = 0;
-+		}
-+	} else
-+		this->blocks.current_offset++;
-+}
-+
-+/**
-+ * __find_next_chain_same_prio - find the next chain of the same priority
-+ * that still has blocks available, wrapping around to the head of the
-+ * priority list if necessary.
-+ */
-+static struct toi_bdev_info *__find_next_chain_same_prio(void)
-+{
-+	struct toi_bdev_info *start_chain = toi_writer_posn.current_chain;
-+	struct toi_bdev_info *this = start_chain;
-+	int orig_prio = this->prio;
-+
-+	do {
-+		this = this->next;
-+
-+		if (!this)
-+			this = prio_chain_head;
-+
-+		/* Back on original chain? Use it again.
*/ -+ if (this == start_chain) -+ return start_chain; -+ -+ } while (!this->blocks.current_extent || this->prio != orig_prio); -+ -+ return this; -+} -+ -+static void find_next_chain(void) -+{ -+ struct toi_bdev_info *this; -+ -+ this = __find_next_chain_same_prio(); -+ -+ /* -+ * If we didn't get another chain of the same priority that we -+ * can use, look for the next priority. -+ */ -+ while (this && !this->blocks.current_extent) -+ this = this->next; -+ -+ toi_writer_posn.current_chain = this; -+} -+ -+/** -+ * toi_extent_state_next - go to the next extent -+ * @blocks: The number of values to progress. -+ * @stripe_mode: Whether to spread usage across all chains. -+ * -+ * Given a state, progress to the next valid entry. We may begin in an -+ * invalid state, as we do when invoked after extent_state_goto_start below. -+ * -+ * When using compression and expected_compression > 0, we let the image size -+ * be larger than storage, so we can validly run out of data to return. -+ **/ -+static unsigned long toi_extent_state_next(int blocks, int current_stream) -+{ -+ int i; -+ -+ if (!toi_writer_posn.current_chain) -+ return -ENOSPC; -+ -+ /* Assume chains always have lengths that are multiples of @blocks */ -+ for (i = 0; i < blocks; i++) -+ toi_extent_chain_next(); -+ -+ /* The header stream is not striped */ -+ if (current_stream || -+ !toi_writer_posn.current_chain->blocks.current_extent) -+ find_next_chain(); -+ -+ return toi_writer_posn.current_chain ? 
0 : -ENOSPC;
-+}
-+
-+static void toi_insert_chain_in_prio_list(struct toi_bdev_info *this)
-+{
-+	struct toi_bdev_info **prev_ptr;
-+	struct toi_bdev_info *cur;
-+
-+	/* Loop through the existing chain, finding where to insert it */
-+	prev_ptr = &prio_chain_head;
-+	cur = prio_chain_head;
-+
-+	while (cur && cur->prio >= this->prio) {
-+		prev_ptr = &cur->next;
-+		cur = cur->next;
-+	}
-+
-+	this->next = *prev_ptr;
-+	*prev_ptr = this;
-+
-+	num_chains++;
-+}
-+
-+/**
-+ * toi_extent_state_goto_start - reinitialize the extent chain iterators
-+ **/
-+void toi_extent_state_goto_start(void)
-+{
-+	struct toi_bdev_info *this = prio_chain_head;
-+
-+	while (this) {
-+		toi_message(TOI_IO, TOI_VERBOSE, 0,
-+			"Setting current extent to %p.", this->blocks.first);
-+		this->blocks.current_extent = this->blocks.first;
-+		if (this->blocks.current_extent) {
-+			toi_message(TOI_IO, TOI_VERBOSE, 0,
-+					"Setting current offset to %lu.",
-+					this->blocks.current_extent->start);
-+			this->blocks.current_offset =
-+				this->blocks.current_extent->start;
-+		}
-+
-+		this = this->next;
-+	}
-+
-+	toi_message(TOI_IO, TOI_VERBOSE, 0, "Setting current chain to %p.",
-+			prio_chain_head);
-+	toi_writer_posn.current_chain = prio_chain_head;
-+	toi_message(TOI_IO, TOI_VERBOSE, 0, "Leaving extent state goto start.");
-+}
-+
-+/**
-+ * toi_extent_state_save - save state of the iterator
-+ * @slot: The slot in which to save the current position
-+ *
-+ * Save the current position in a format that can be used with relocated
-+ * chains (at resume time).
-+ **/ -+void toi_extent_state_save(int slot) -+{ -+ struct toi_bdev_info *cur_chain = prio_chain_head; -+ struct hibernate_extent *extent; -+ struct hibernate_extent_saved_state *chain_state; -+ int i = 0; -+ -+ toi_message(TOI_IO, TOI_VERBOSE, 0, "toi_extent_state_save, slot %d.", -+ slot); -+ -+ if (!toi_writer_posn.current_chain) { -+ toi_message(TOI_IO, TOI_VERBOSE, 0, "No current chain => " -+ "chain_num = -1."); -+ toi_writer_posn.saved_chain_number[slot] = -1; -+ return; -+ } -+ -+ while (cur_chain) { -+ i++; -+ toi_message(TOI_IO, TOI_VERBOSE, 0, "Saving chain %d (%p) " -+ "state, slot %d.", i, cur_chain, slot); -+ -+ chain_state = &cur_chain->saved_state[slot]; -+ -+ chain_state->offset = cur_chain->blocks.current_offset; -+ -+ if (toi_writer_posn.current_chain == cur_chain) { -+ toi_writer_posn.saved_chain_number[slot] = i; -+ toi_message(TOI_IO, TOI_VERBOSE, 0, "This is the chain " -+ "we were on => chain_num is %d.", i); -+ } -+ -+ if (!cur_chain->blocks.current_extent) { -+ chain_state->extent_num = 0; -+ toi_message(TOI_IO, TOI_VERBOSE, 0, "No current extent " -+ "for this chain => extent_num %d is 0.", -+ i); -+ cur_chain = cur_chain->next; -+ continue; -+ } -+ -+ extent = cur_chain->blocks.first; -+ chain_state->extent_num = 1; -+ -+ while (extent != cur_chain->blocks.current_extent) { -+ chain_state->extent_num++; -+ extent = extent->next; -+ } -+ -+ toi_message(TOI_IO, TOI_VERBOSE, 0, "extent num %d is %d.", i, -+ chain_state->extent_num); -+ -+ cur_chain = cur_chain->next; -+ } -+ toi_message(TOI_IO, TOI_VERBOSE, 0, -+ "Completed saving extent state slot %d.", slot); -+} -+ -+/** -+ * toi_extent_state_restore - restore the position saved by extent_state_save -+ * @state: State to populate -+ * @saved_state: Iterator saved to restore -+ **/ -+void toi_extent_state_restore(int slot) -+{ -+ int i = 0; -+ struct toi_bdev_info *cur_chain = prio_chain_head; -+ struct hibernate_extent_saved_state *chain_state; -+ -+ toi_message(TOI_IO, TOI_VERBOSE, 0, 
-+ "toi_extent_state_restore - slot %d.", slot); -+ -+ if (toi_writer_posn.saved_chain_number[slot] == -1) { -+ toi_writer_posn.current_chain = NULL; -+ return; -+ } -+ -+ while (cur_chain) { -+ int posn; -+ int j; -+ i++; -+ toi_message(TOI_IO, TOI_VERBOSE, 0, "Restoring chain %d (%p) " -+ "state, slot %d.", i, cur_chain, slot); -+ -+ chain_state = &cur_chain->saved_state[slot]; -+ -+ posn = chain_state->extent_num; -+ -+ cur_chain->blocks.current_extent = cur_chain->blocks.first; -+ cur_chain->blocks.current_offset = chain_state->offset; -+ -+ if (i == toi_writer_posn.saved_chain_number[slot]) { -+ toi_writer_posn.current_chain = cur_chain; -+ toi_message(TOI_IO, TOI_VERBOSE, 0, -+ "Found current chain."); -+ } -+ -+ for (j = 0; j < 4; j++) -+ if (i == toi_writer_posn.saved_chain_number[j]) { -+ toi_writer_posn.saved_chain_ptr[j] = cur_chain; -+ toi_message(TOI_IO, TOI_VERBOSE, 0, -+ "Found saved chain ptr %d (%p) (offset" -+ " %d).", j, cur_chain, -+ cur_chain->saved_state[j].offset); -+ } -+ -+ if (posn) { -+ while (--posn) -+ cur_chain->blocks.current_extent = -+ cur_chain->blocks.current_extent->next; -+ } else -+ cur_chain->blocks.current_extent = NULL; -+ -+ cur_chain = cur_chain->next; -+ } -+ toi_message(TOI_IO, TOI_VERBOSE, 0, "Done."); -+ if (test_action_state(TOI_LOGALL)) -+ dump_block_chains(); -+} -+ -+/* -+ * Storage needed -+ * -+ * Returns amount of space in the image header required -+ * for the chain data. This ignores the links between -+ * pages, which we factor in when allocating the space. 
-+ */ -+int toi_bio_devinfo_storage_needed(void) -+{ -+ int result = sizeof(num_chains); -+ struct toi_bdev_info *chain = prio_chain_head; -+ -+ while (chain) { -+ result += metadata_size; -+ -+ /* Chain size */ -+ result += sizeof(int); -+ -+ /* Extents */ -+ result += (2 * sizeof(unsigned long) * -+ chain->blocks.num_extents); -+ -+ chain = chain->next; -+ } -+ -+ result += 4 * sizeof(int); -+ return result; -+} -+ -+static unsigned long chain_pages_used(struct toi_bdev_info *chain) -+{ -+ struct hibernate_extent *this = chain->blocks.first; -+ struct hibernate_extent_saved_state *state = &chain->saved_state[3]; -+ unsigned long size = 0; -+ int extent_idx = 1; -+ -+ if (!state->extent_num) { -+ if (!this) -+ return 0; -+ else -+ return chain->blocks.size; -+ } -+ -+ while (extent_idx < state->extent_num) { -+ size += (this->end - this->start + 1); -+ this = this->next; -+ extent_idx++; -+ } -+ -+ /* We didn't use the one we're sitting on, so don't count it */ -+ return size + state->offset - this->start; -+} -+ -+/** -+ * toi_serialise_extent_chain - write a chain in the image -+ * @chain: Chain to write. 
-+ **/ -+static int toi_serialise_extent_chain(struct toi_bdev_info *chain) -+{ -+ struct hibernate_extent *this; -+ int ret; -+ int i = 1; -+ -+ chain->pages_used = chain_pages_used(chain); -+ -+ if (test_action_state(TOI_LOGALL)) -+ dump_block_chains(); -+ toi_message(TOI_IO, TOI_VERBOSE, 0, "Serialising chain (dev_t %lx).", -+ chain->dev_t); -+ /* Device info - dev_t, prio, bmap_shift, blocks per page, positions */ -+ ret = toiActiveAllocator->rw_header_chunk(WRITE, &toi_blockwriter_ops, -+ (char *) &chain->uuid, metadata_size); -+ if (ret) -+ return ret; -+ -+ /* Num extents */ -+ ret = toiActiveAllocator->rw_header_chunk(WRITE, &toi_blockwriter_ops, -+ (char *) &chain->blocks.num_extents, sizeof(int)); -+ if (ret) -+ return ret; -+ -+ toi_message(TOI_IO, TOI_VERBOSE, 0, "%d extents.", -+ chain->blocks.num_extents); -+ -+ this = chain->blocks.first; -+ while (this) { -+ toi_message(TOI_IO, TOI_VERBOSE, 0, "Extent %d.", i); -+ ret = toiActiveAllocator->rw_header_chunk(WRITE, -+ &toi_blockwriter_ops, -+ (char *) this, 2 * sizeof(this->start)); -+ if (ret) -+ return ret; -+ this = this->next; -+ i++; -+ } -+ -+ return ret; -+} -+ -+int toi_serialise_extent_chains(void) -+{ -+ struct toi_bdev_info *this = prio_chain_head; -+ int result; -+ -+ /* Write the number of chains */ -+ toi_message(TOI_IO, TOI_VERBOSE, 0, "Write number of chains (%d)", -+ num_chains); -+ result = toiActiveAllocator->rw_header_chunk(WRITE, -+ &toi_blockwriter_ops, (char *) &num_chains, -+ sizeof(int)); -+ if (result) -+ return result; -+ -+ /* Then the chains themselves */ -+ while (this) { -+ result = toi_serialise_extent_chain(this); -+ if (result) -+ return result; -+ this = this->next; -+ } -+ -+ /* -+ * Finally, the chain we should be on at the start of each -+ * section. 
-+ */ -+ toi_message(TOI_IO, TOI_VERBOSE, 0, "Saved chain numbers."); -+ result = toiActiveAllocator->rw_header_chunk(WRITE, -+ &toi_blockwriter_ops, -+ (char *) &toi_writer_posn.saved_chain_number[0], -+ 4 * sizeof(int)); -+ -+ return result; -+} -+ -+int toi_register_storage_chain(struct toi_bdev_info *new) -+{ -+ toi_message(TOI_IO, TOI_VERBOSE, 0, "Inserting chain %p into list.", -+ new); -+ toi_insert_chain_in_prio_list(new); -+ return 0; -+} -+ -+static void free_bdev_info(struct toi_bdev_info *chain) -+{ -+ toi_message(TOI_IO, TOI_VERBOSE, 0, "Free chain %p.", chain); -+ -+ toi_message(TOI_IO, TOI_VERBOSE, 0, " - Block extents."); -+ toi_put_extent_chain(&chain->blocks); -+ -+ /* -+ * The allocator may need to do more than just free the chains -+ * (swap_free, for example). Don't call from boot kernel. -+ */ -+ toi_message(TOI_IO, TOI_VERBOSE, 0, " - Allocator extents."); -+ if (chain->allocator) -+ chain->allocator->bio_allocator_ops->free_storage(chain); -+ -+ /* -+ * Dropping out of reading atomic copy? Need to undo -+ * toi_open_by_devnum. 
-+ */ -+ toi_message(TOI_IO, TOI_VERBOSE, 0, " - Bdev."); -+ if (chain->bdev && !IS_ERR(chain->bdev) && -+ chain->bdev != resume_block_device && -+ chain->bdev != header_block_device && -+ test_toi_state(TOI_TRYING_TO_RESUME)) -+ toi_close_bdev(chain->bdev); -+ -+ /* Poison */ -+ toi_message(TOI_IO, TOI_VERBOSE, 0, " - Struct."); -+ toi_kfree(39, chain, sizeof(*chain)); -+ -+ if (prio_chain_head == chain) -+ prio_chain_head = NULL; -+ -+ num_chains--; -+} -+ -+void free_all_bdev_info(void) -+{ -+ struct toi_bdev_info *this = prio_chain_head; -+ -+ while (this) { -+ struct toi_bdev_info *next = this->next; -+ free_bdev_info(this); -+ this = next; -+ } -+ -+ memset((char *) &toi_writer_posn, 0, sizeof(toi_writer_posn)); -+ prio_chain_head = NULL; -+} -+ -+static void set_up_start_position(void) -+{ -+ toi_writer_posn.current_chain = prio_chain_head; -+ go_next_page(0, 0); -+} -+ -+/** -+ * toi_load_extent_chain - read back a chain saved in the image -+ * @chain: Chain to load -+ * -+ * The linked list of extents is reconstructed from the disk. chain will point -+ * to the first entry. 
-+ **/
-+int toi_load_extent_chain(int index, int *num_loaded)
-+{
-+	struct toi_bdev_info *chain = toi_kzalloc(39,
-+			sizeof(struct toi_bdev_info), GFP_ATOMIC);
-+	struct hibernate_extent *this, *last = NULL;
-+	int i, ret;
-+
-+	if (!chain) {
-+		printk(KERN_ERR "Failed to allocate a bdev info struct.\n");
-+		return -ENOMEM;
-+	}
-+
-+	toi_message(TOI_IO, TOI_VERBOSE, 0, "Loading extent chain %d.", index);
-+	/* Get dev_t, prio, bmap_shift, blocks per page, positions */
-+	ret = toiActiveAllocator->rw_header_chunk_noreadahead(READ, NULL,
-+			(char *) &chain->uuid, metadata_size);
-+
-+	if (ret) {
-+		printk(KERN_ERR "Failed to read extent chain metadata.\n");
-+		toi_kfree(39, chain, sizeof(*chain));
-+		return 1;
-+	}
-+
-+	toi_bkd.pages_used[index] = chain->pages_used;
-+
-+	ret = toiActiveAllocator->rw_header_chunk_noreadahead(READ, NULL,
-+			(char *) &chain->blocks.num_extents, sizeof(int));
-+	if (ret) {
-+		printk(KERN_ERR "Failed to read the size of extent chain.\n");
-+		toi_kfree(39, chain, sizeof(*chain));
-+		return 1;
-+	}
-+
-+	toi_message(TOI_IO, TOI_VERBOSE, 0, "%d extents.",
-+			chain->blocks.num_extents);
-+
-+	for (i = 0; i < chain->blocks.num_extents; i++) {
-+		toi_message(TOI_IO, TOI_VERBOSE, 0, "Extent %d.", i + 1);
-+
-+		this = toi_kzalloc(2, sizeof(struct hibernate_extent),
-+				TOI_ATOMIC_GFP);
-+		if (!this) {
-+			printk(KERN_INFO "Failed to allocate a new extent.\n");
-+			free_bdev_info(chain);
-+			return -ENOMEM;
-+		}
-+		this->next = NULL;
-+		/* Get the next extent */
-+		ret = toiActiveAllocator->rw_header_chunk_noreadahead(READ,
-+				NULL, (char *) this, 2 * sizeof(this->start));
-+		if (ret) {
-+			printk(KERN_INFO "Failed to read an extent.\n");
-+			toi_kfree(2, this, sizeof(struct hibernate_extent));
-+			free_bdev_info(chain);
-+			return 1;
-+		}
-+
-+		if (last)
-+			last->next = this;
-+		else {
-+			char b1[32], b2[32], b3[32];
-+			/*
-+			 * Open the bdev
-+			 */
-+			toi_message(TOI_IO, TOI_VERBOSE, 0,
-+				"Chain dev_t is %s. Resume dev_t is %s.
Header" -+ " bdev_t is %s.\n", -+ format_dev_t(b1, chain->dev_t), -+ format_dev_t(b2, resume_dev_t), -+ format_dev_t(b3, toi_sig_data->header_dev_t)); -+ -+ if (chain->dev_t == resume_dev_t) -+ chain->bdev = resume_block_device; -+ else if (chain->dev_t == toi_sig_data->header_dev_t) -+ chain->bdev = header_block_device; -+ else { -+ chain->bdev = toi_open_bdev(chain->uuid, -+ chain->dev_t, 1); -+ if (IS_ERR(chain->bdev)) { -+ free_bdev_info(chain); -+ return -ENODEV; -+ } -+ } -+ -+ toi_message(TOI_IO, TOI_VERBOSE, 0, "Chain bmap shift " -+ "is %d and blocks per page is %d.", -+ chain->bmap_shift, -+ chain->blocks_per_page); -+ -+ chain->blocks.first = this; -+ -+ /* -+ * Couldn't do this earlier, but can't do -+ * goto_start now - we may have already used blocks -+ * in the first chain. -+ */ -+ chain->blocks.current_extent = this; -+ chain->blocks.current_offset = this->start; -+ -+ /* -+ * Can't wait until we've read the whole chain -+ * before we insert it in the list. We might need -+ * this chain to read the next page in the header -+ */ -+ toi_insert_chain_in_prio_list(chain); -+ } -+ -+ /* -+ * We have to wait until 2 extents are loaded before setting up -+ * properly because if the first extent has only one page, we -+ * will need to put the position on the second extent. Sounds -+ * obvious, but it wasn't! -+ */ -+ (*num_loaded)++; -+ if ((*num_loaded) == 2) -+ set_up_start_position(); -+ last = this; -+ } -+ -+ /* -+ * Shouldn't get empty chains, but it's not impossible. Link them in so -+ * they get freed properly later. 
-+ */ -+ if (!chain->blocks.num_extents) -+ toi_insert_chain_in_prio_list(chain); -+ -+ if (!chain->blocks.current_extent) { -+ chain->blocks.current_extent = chain->blocks.first; -+ if (chain->blocks.current_extent) -+ chain->blocks.current_offset = -+ chain->blocks.current_extent->start; -+ } -+ return 0; -+} -+ -+int toi_load_extent_chains(void) -+{ -+ int result; -+ int to_load; -+ int i; -+ int extents_loaded = 0; -+ -+ result = toiActiveAllocator->rw_header_chunk_noreadahead(READ, NULL, -+ (char *) &to_load, -+ sizeof(int)); -+ if (result) -+ return result; -+ toi_message(TOI_IO, TOI_VERBOSE, 0, "%d chains to read.", to_load); -+ -+ for (i = 0; i < to_load; i++) { -+ toi_message(TOI_IO, TOI_VERBOSE, 0, " >> Loading chain %d/%d.", -+ i, to_load); -+ result = toi_load_extent_chain(i, &extents_loaded); -+ if (result) -+ return result; -+ } -+ -+ /* If we never got to a second extent, we still need to do this. */ -+ if (extents_loaded == 1) -+ set_up_start_position(); -+ -+ toi_message(TOI_IO, TOI_VERBOSE, 0, "Save chain numbers."); -+ result = toiActiveAllocator->rw_header_chunk_noreadahead(READ, -+ &toi_blockwriter_ops, -+ (char *) &toi_writer_posn.saved_chain_number[0], -+ 4 * sizeof(int)); -+ -+ return result; -+} -+ -+static int toi_end_of_stream(int writing, int section_barrier) -+{ -+ struct toi_bdev_info *cur_chain = toi_writer_posn.current_chain; -+ int compare_to = next_section[current_stream]; -+ struct toi_bdev_info *compare_chain = -+ toi_writer_posn.saved_chain_ptr[compare_to]; -+ int compare_offset = compare_chain ? 
-+		compare_chain->saved_state[compare_to].offset : 0;
-+
-+	if (!section_barrier)
-+		return 0;
-+
-+	if (!cur_chain)
-+		return 1;
-+
-+	if (cur_chain == compare_chain &&
-+	    cur_chain->blocks.current_offset == compare_offset) {
-+		if (writing) {
-+			if (!current_stream) {
-+				debug_broken_header();
-+				return 1;
-+			}
-+		} else {
-+			more_readahead = 0;
-+			toi_message(TOI_IO, TOI_VERBOSE, 0,
-+					"Reached the end of stream %d "
-+					"(not an error).", current_stream);
-+			return 1;
-+		}
-+	}
-+
-+	return 0;
-+}
-+
-+/**
-+ * go_next_page - skip blocks to the start of the next page
-+ * @writing: Whether we're reading or writing the image.
-+ *
-+ * Go forward one page.
-+ **/
-+int go_next_page(int writing, int section_barrier)
-+{
-+	struct toi_bdev_info *cur_chain = toi_writer_posn.current_chain;
-+	int max = cur_chain ? cur_chain->blocks_per_page : 1;
-+
-+	/* Go forward a page - or maybe two. Don't stripe the header,
-+	 * so that bad fragmentation doesn't put the extent data containing
-+	 * the location of the second page out of the first header page.
-+	 */
-+	if (toi_extent_state_next(max, current_stream)) {
-+		/* Don't complain if readahead falls off the end */
-+		if (writing && section_barrier) {
-+			toi_message(TOI_IO, TOI_VERBOSE, 0, "Extent state eof. "
-+				"Expected compression ratio too optimistic?");
-+			if (test_action_state(TOI_LOGALL))
-+				dump_block_chains();
-+		}
-+		toi_message(TOI_IO, TOI_VERBOSE, 0, "Ran out of extents to "
-+				"read/write. (Not necessarily a fatal error.)");
-+		return -ENOSPC;
-+	}
-+
-+	return 0;
-+}
-+
-+int devices_of_same_priority(struct toi_bdev_info *this)
-+{
-+	struct toi_bdev_info *check = prio_chain_head;
-+	int i = 0;
-+
-+	while (check) {
-+		if (check->prio == this->prio)
-+			i++;
-+		check = check->next;
-+	}
-+
-+	return i;
-+}
-+
-+/**
-+ * toi_bio_rw_page - do i/o on the next disk page in the image
-+ * @writing: Whether reading or writing.
-+ * @page: Page to do i/o on.
-+ * @is_readahead: Whether we're doing readahead -+ * @free_group: The group used in allocating the page -+ * -+ * Submit a page for reading or writing, possibly readahead. -+ * Pass the group used in allocating the page as well, as it should -+ * be freed on completion of the bio if we're writing the page. -+ **/ -+int toi_bio_rw_page(int writing, struct page *page, -+ int is_readahead, int free_group) -+{ -+ int result = toi_end_of_stream(writing, 1); -+ struct toi_bdev_info *dev_info = toi_writer_posn.current_chain; -+ -+ if (result) { -+ toi_message(TOI_IO, TOI_VERBOSE, 0, "Seeking to read/write " -+ "another page when stream has ended."); -+ return -ENOSPC; -+ } -+ -+ toi_message(TOI_IO, TOI_VERBOSE, 0, -+ "%sing device %lx, sector %d << %d.", -+ writing ? "Writ" : "Read", -+ dev_info->bdev, dev_info->blocks.current_offset, -+ dev_info->bmap_shift); -+ -+ result = toi_do_io(writing, dev_info->bdev, -+ dev_info->blocks.current_offset << dev_info->bmap_shift, -+ page, is_readahead, 0, free_group); -+ -+ /* Ignore the result here - will check end of stream if come in again */ -+ go_next_page(writing, 1); -+ -+ if (result) -+ printk(KERN_ERR "toi_do_io returned %d.\n", result); -+ return result; -+} -+ -+dev_t get_header_dev_t(void) -+{ -+ return prio_chain_head->dev_t; -+} -+ -+struct block_device *get_header_bdev(void) -+{ -+ return prio_chain_head->bdev; -+} -+ -+unsigned long get_headerblock(void) -+{ -+ return prio_chain_head->blocks.first->start << -+ prio_chain_head->bmap_shift; -+} -+ -+int get_main_pool_phys_params(void) -+{ -+ struct toi_bdev_info *this = prio_chain_head; -+ int result; -+ -+ while (this) { -+ result = this->allocator->bio_allocator_ops->bmap(this); -+ if (result) -+ return result; -+ this = this->next; -+ } -+ -+ return 0; -+} -+ -+static int apply_header_reservation(void) -+{ -+ int i; -+ -+ if (!header_pages_reserved) { -+ toi_message(TOI_IO, TOI_VERBOSE, 0, -+ "No header pages reserved at the moment."); -+ return 0; -+ } -+ -+ 
toi_message(TOI_IO, TOI_VERBOSE, 0, "Applying header reservation."); -+ -+ /* Apply header space reservation */ -+ toi_extent_state_goto_start(); -+ -+ for (i = 0; i < header_pages_reserved; i++) -+ if (go_next_page(1, 0)) -+ return -ENOSPC; -+ -+ /* The end of header pages will be the start of pageset 2 */ -+ toi_extent_state_save(2); -+ -+ toi_message(TOI_IO, TOI_VERBOSE, 0, -+ "Finished applying header reservation."); -+ return 0; -+} -+ -+static int toi_bio_register_storage(void) -+{ -+ int result = 0; -+ struct toi_module_ops *this_module; -+ -+ list_for_each_entry(this_module, &toi_modules, module_list) { -+ if (!this_module->enabled || -+ this_module->type != BIO_ALLOCATOR_MODULE) -+ continue; -+ toi_message(TOI_IO, TOI_VERBOSE, 0, -+ "Registering storage from %s.", -+ this_module->name); -+ result = this_module->bio_allocator_ops->register_storage(); -+ if (result) -+ break; -+ } -+ -+ return result; -+} -+ -+int toi_bio_allocate_storage(unsigned long request) -+{ -+ struct toi_bdev_info *chain = prio_chain_head; -+ unsigned long to_get = request; -+ unsigned long extra_pages, needed; -+ int no_free = 0; -+ -+ if (!chain) { -+ int result = toi_bio_register_storage(); -+ toi_message(TOI_IO, TOI_VERBOSE, 0, "toi_bio_allocate_storage: " -+ "Registering storage."); -+ if (result) -+ return 0; -+ chain = prio_chain_head; -+ if (!chain) { -+ printk("TuxOnIce: No storage was registered.\n"); -+ return 0; -+ } -+ } -+ -+ toi_message(TOI_IO, TOI_VERBOSE, 0, "toi_bio_allocate_storage: " -+ "Request is %lu pages.", request); -+ extra_pages = DIV_ROUND_UP(request * (sizeof(unsigned long) -+ + sizeof(int)), PAGE_SIZE); -+ needed = request + extra_pages + header_pages_reserved; -+ toi_message(TOI_IO, TOI_VERBOSE, 0, "Adding %lu extra pages and %lu " -+ "for header => %lu.", -+ extra_pages, header_pages_reserved, needed); -+ toi_message(TOI_IO, TOI_VERBOSE, 0, "Already allocated %lu pages.", -+ raw_pages_allocd); -+ -+ to_get = needed > raw_pages_allocd ? 
needed - raw_pages_allocd : 0; -+ toi_message(TOI_IO, TOI_VERBOSE, 0, "Need to get %lu pages.", to_get); -+ -+ if (!to_get) -+ return apply_header_reservation(); -+ -+ while (to_get && chain) { -+ int num_group = devices_of_same_priority(chain); -+ int divisor = num_group - no_free; -+ int i; -+ unsigned long portion = DIV_ROUND_UP(to_get, divisor); -+ unsigned long got = 0; -+ unsigned long got_this_round = 0; -+ struct toi_bdev_info *top = chain; -+ -+ toi_message(TOI_IO, TOI_VERBOSE, 0, -+ " Start of loop. To get is %lu. Divisor is %d.", -+ to_get, divisor); -+ no_free = 0; -+ -+ /* -+ * We're aiming to spread the allocated storage as evenly -+ * as possible, but we also want to get all the storage we -+ * can off this priority. -+ */ -+ for (i = 0; i < num_group; i++) { -+ struct toi_bio_allocator_ops *ops = -+ chain->allocator->bio_allocator_ops; -+ toi_message(TOI_IO, TOI_VERBOSE, 0, -+ " Asking for %lu pages from chain %p.", -+ portion, chain); -+ got = ops->allocate_storage(chain, portion); -+ toi_message(TOI_IO, TOI_VERBOSE, 0, -+ " Got %lu pages from allocator %p.", -+ got, chain); -+ if (!got) -+ no_free++; -+ got_this_round += got; -+ chain = chain->next; -+ } -+ toi_message(TOI_IO, TOI_VERBOSE, 0, " Loop finished. Got a " -+ "total of %lu pages from %d allocators.", -+ got_this_round, divisor - no_free); -+ -+ raw_pages_allocd += got_this_round; -+ to_get = needed > raw_pages_allocd ? needed - raw_pages_allocd : -+ 0; -+ -+ /* -+ * If we got anything from chains of this priority and we -+ * still have storage to allocate, go over this priority -+ * again. -+ */ -+ if (got_this_round && to_get) -+ chain = top; -+ else -+ no_free = 0; -+ } -+ -+ toi_message(TOI_IO, TOI_VERBOSE, 0, "Finished allocating. Calling " -+ "get_main_pool_phys_params"); -+ /* Now let swap allocator bmap the pages */ -+ get_main_pool_phys_params(); -+ -+ toi_message(TOI_IO, TOI_VERBOSE, 0, "Done. 
Reserving header."); -+ return apply_header_reservation(); -+} -+ -+void toi_bio_chains_post_atomic(struct toi_boot_kernel_data *bkd) -+{ -+ int i = 0; -+ struct toi_bdev_info *cur_chain = prio_chain_head; -+ -+ while (cur_chain) { -+ cur_chain->pages_used = bkd->pages_used[i]; -+ cur_chain = cur_chain->next; -+ i++; -+ } -+} -+ -+int toi_bio_chains_debug_info(char *buffer, int size) -+{ -+ /* Show what we actually used */ -+ struct toi_bdev_info *cur_chain = prio_chain_head; -+ int len = 0; -+ -+ while (cur_chain) { -+ len += scnprintf(buffer + len, size - len, " Used %lu pages " -+ "from %s.\n", cur_chain->pages_used, -+ cur_chain->name); -+ cur_chain = cur_chain->next; -+ } -+ -+ return len; -+} -diff --git a/kernel/power/tuxonice_bio_core.c b/kernel/power/tuxonice_bio_core.c -new file mode 100644 -index 0000000..b8ae996 ---- /dev/null -+++ b/kernel/power/tuxonice_bio_core.c -@@ -0,0 +1,1810 @@ -+/* -+ * kernel/power/tuxonice_bio.c -+ * -+ * Copyright (C) 2004-2010 Nigel Cunningham (nigel at tuxonice net) -+ * -+ * Distributed under GPLv2. -+ * -+ * This file contains block io functions for TuxOnIce. These are -+ * used by the swapwriter and it is planned that they will also -+ * be used by the NFSwriter. 
-+ * -+ */ -+ -+#include -+#include -+#include -+#include -+#include -+#include -+ -+#include "tuxonice.h" -+#include "tuxonice_sysfs.h" -+#include "tuxonice_modules.h" -+#include "tuxonice_prepare_image.h" -+#include "tuxonice_bio.h" -+#include "tuxonice_ui.h" -+#include "tuxonice_alloc.h" -+#include "tuxonice_io.h" -+#include "tuxonice_builtin.h" -+#include "tuxonice_bio_internal.h" -+ -+#define MEMORY_ONLY 1 -+#define THROTTLE_WAIT 2 -+ -+/* #define MEASURE_MUTEX_CONTENTION */ -+#ifndef MEASURE_MUTEX_CONTENTION -+#define my_mutex_lock(index, the_lock) mutex_lock(the_lock) -+#define my_mutex_unlock(index, the_lock) mutex_unlock(the_lock) -+#else -+unsigned long mutex_times[2][2][NR_CPUS]; -+#define my_mutex_lock(index, the_lock) do { \ -+ int have_mutex; \ -+ have_mutex = mutex_trylock(the_lock); \ -+ if (!have_mutex) { \ -+ mutex_lock(the_lock); \ -+ mutex_times[index][0][smp_processor_id()]++; \ -+ } else { \ -+ mutex_times[index][1][smp_processor_id()]++; \ -+ } -+ -+#define my_mutex_unlock(index, the_lock) \ -+ mutex_unlock(the_lock); \ -+} while (0) -+#endif -+ -+static int page_idx, reset_idx; -+ -+static int target_outstanding_io = 1024; -+static int max_outstanding_writes, max_outstanding_reads; -+ -+static struct page *bio_queue_head, *bio_queue_tail; -+static atomic_t toi_bio_queue_size; -+static DEFINE_SPINLOCK(bio_queue_lock); -+ -+static int free_mem_throttle, throughput_throttle; -+int more_readahead = 1; -+static struct page *readahead_list_head, *readahead_list_tail; -+ -+static struct page *waiting_on; -+ -+static atomic_t toi_io_in_progress, toi_io_done; -+static DECLARE_WAIT_QUEUE_HEAD(num_in_progress_wait); -+ -+int current_stream; -+/* Not static, so that the allocators can setup and complete -+ * writing the header */ -+char *toi_writer_buffer; -+int toi_writer_buffer_posn; -+ -+static DEFINE_MUTEX(toi_bio_mutex); -+static DEFINE_MUTEX(toi_bio_readahead_mutex); -+ -+static struct task_struct *toi_queue_flusher; -+static int 
toi_bio_queue_flush_pages(int dedicated_thread);
-+
-+struct toi_module_ops toi_blockwriter_ops;
-+
-+#define TOTAL_OUTSTANDING_IO (atomic_read(&toi_io_in_progress) + \
-+		atomic_read(&toi_bio_queue_size))
-+
-+unsigned long raw_pages_allocd, header_pages_reserved;
-+
-+/**
-+ * set_free_mem_throttle - set the point where we pause to avoid oom.
-+ *
-+ * Initially, this value is zero, but when we first fail to allocate memory,
-+ * we set it (plus a buffer) and thereafter throttle i/o once that limit is
-+ * reached.
-+ **/
-+static void set_free_mem_throttle(void)
-+{
-+	int new_throttle = nr_unallocated_buffer_pages() + 256;
-+
-+	if (new_throttle > free_mem_throttle)
-+		free_mem_throttle = new_throttle;
-+}
-+
-+#define NUM_REASONS 7
-+static atomic_t reasons[NUM_REASONS];
-+static char *reason_name[NUM_REASONS] = {
-+	"readahead not ready",
-+	"bio allocation",
-+	"synchronous I/O",
-+	"toi_bio_get_new_page",
-+	"memory low",
-+	"readahead buffer allocation",
-+	"throughput_throttle",
-+};
-+
-+/* User Specified Parameters. */
-+unsigned long resume_firstblock;
-+dev_t resume_dev_t;
-+struct block_device *resume_block_device;
-+static atomic_t resume_bdev_open_count;
-+
-+struct block_device *header_block_device;
-+
-+/**
-+ * toi_open_bdev: Open a bdev at resume time.
-+ *
-+ * index: The swap index. May be MAX_SWAPFILES for the resume_dev_t
-+ * (the user can have resume= pointing at a swap partition/file that isn't
-+ * swapon'd when they hibernate). MAX_SWAPFILES+1 for the first page of the
-+ * header. It will be from a swap partition that was enabled when we hibernated,
-+ * but we don't know its real index until we read that first page.
-+ * dev_t: The device major/minor.
-+ * display_errs: Whether to display errors if opening the device fails.
-+ *
-+ * We stored a dev_t in the image header. Open the matching device without
-+ * requiring /dev/ in most cases and record the details needed
-+ * to close it later and avoid duplicating work.
-+ */ -+struct block_device *toi_open_bdev(char *uuid, dev_t default_device, -+ int display_errs) -+{ -+ struct block_device *bdev; -+ dev_t device = default_device; -+ char buf[32]; -+ -+ if (uuid) { -+ device = blk_lookup_uuid(uuid); -+ if (!device) { -+ device = default_device; -+ printk(KERN_DEBUG "Unable to resolve uuid. Falling back" -+ " to dev_t.\n"); -+ } else -+ printk(KERN_DEBUG "Resolved uuid to device %s.\n", -+ format_dev_t(buf, device)); -+ } -+ -+ if (!device) { -+ printk(KERN_ERR "TuxOnIce attempting to open a " -+ "blank dev_t!\n"); -+ dump_stack(); -+ return NULL; -+ } -+ bdev = toi_open_by_devnum(device); -+ -+ if (IS_ERR(bdev) || !bdev) { -+ if (display_errs) -+ toi_early_boot_message(1, TOI_CONTINUE_REQ, -+ "Failed to get access to block device " -+ "\"%x\" (error %d).\n Maybe you need " -+ "to run mknod and/or lvmsetup in an " -+ "initrd/ramfs?", device, bdev); -+ return ERR_PTR(-EINVAL); -+ } -+ toi_message(TOI_IO, TOI_VERBOSE, 0, -+ "TuxOnIce got bdev %p for dev_t %x.", -+ bdev, device); -+ -+ return bdev; -+} -+ -+static void toi_bio_reserve_header_space(unsigned long request) -+{ -+ header_pages_reserved = request; -+} -+ -+/** -+ * do_bio_wait - wait for some TuxOnIce I/O to complete -+ * @reason: The array index of the reason we're waiting. -+ * -+ * Wait for a particular page of I/O if we're after a particular page. -+ * If we're not after a particular page, wait instead for all in flight -+ * I/O to be completed or for us to have enough free memory to be able -+ * to submit more I/O. -+ * -+ * If we wait, we also update our statistics regarding why we waited. 
-+ **/ -+static void do_bio_wait(int reason) -+{ -+ struct page *was_waiting_on = waiting_on; -+ -+ /* On SMP, waiting_on can be reset, so we make a copy */ -+ if (was_waiting_on) { -+ wait_on_page_locked(was_waiting_on); -+ atomic_inc(&reasons[reason]); -+ } else { -+ atomic_inc(&reasons[reason]); -+ -+ wait_event(num_in_progress_wait, -+ !atomic_read(&toi_io_in_progress) || -+ nr_unallocated_buffer_pages() > free_mem_throttle); -+ } -+} -+ -+/** -+ * throttle_if_needed - wait for I/O completion if throttle points are reached -+ * @flags: What to check and how to act. -+ * -+ * Check whether we need to wait for some I/O to complete. We always check -+ * whether we have enough memory available, but may also (depending upon -+ * @reason) check if the throughput throttle limit has been reached. -+ **/ -+static int throttle_if_needed(int flags) -+{ -+ int free_pages = nr_unallocated_buffer_pages(); -+ -+ /* Getting low on memory and I/O is in progress? */ -+ while (unlikely(free_pages < free_mem_throttle) && -+ atomic_read(&toi_io_in_progress) && -+ !test_result_state(TOI_ABORTED)) { -+ if (!(flags & THROTTLE_WAIT)) -+ return -ENOMEM; -+ do_bio_wait(4); -+ free_pages = nr_unallocated_buffer_pages(); -+ } -+ -+ while (!(flags & MEMORY_ONLY) && throughput_throttle && -+ TOTAL_OUTSTANDING_IO >= throughput_throttle && -+ !test_result_state(TOI_ABORTED)) { -+ int result = toi_bio_queue_flush_pages(0); -+ if (result) -+ return result; -+ atomic_inc(&reasons[6]); -+ wait_event(num_in_progress_wait, -+ !atomic_read(&toi_io_in_progress) || -+ TOTAL_OUTSTANDING_IO < throughput_throttle); -+ } -+ -+ return 0; -+} -+ -+/** -+ * update_throughput_throttle - update the raw throughput throttle -+ * @jif_index: The number of times this function has been called. 
-+ * -+ * This function is called four times per second by the core, and used to limit -+ * the amount of I/O we submit at once, spreading out our waiting through the -+ * whole job and letting userui get an opportunity to do its work. -+ * -+ * We don't start limiting I/O until 1/4s has gone so that we get a -+ * decent sample for our initial limit, and keep updating it because -+ * throughput may vary (on rotating media, eg) with our block number. -+ * -+ * We throttle to 1/10s worth of I/O. -+ **/ -+static void update_throughput_throttle(int jif_index) -+{ -+ int done = atomic_read(&toi_io_done); -+ throughput_throttle = done * 2 / 5 / jif_index; -+} -+ -+/** -+ * toi_finish_all_io - wait for all outstanding i/o to complete -+ * -+ * Flush any queued but unsubmitted I/O and wait for it all to complete. -+ **/ -+static int toi_finish_all_io(void) -+{ -+ int result = toi_bio_queue_flush_pages(0); -+ wait_event(num_in_progress_wait, !TOTAL_OUTSTANDING_IO); -+ return result; -+} -+ -+/** -+ * toi_end_bio - bio completion function. -+ * @bio: bio that has completed. -+ * @err: Error value. Yes, like end_swap_bio_read, we ignore it. -+ * -+ * Function called by the block driver from interrupt context when I/O is -+ * completed. If we were writing the page, we want to free it and will have -+ * set bio->bi_private to the parameter we should use in telling the page -+ * allocation accounting code what the page was allocated for. If we're -+ * reading the page, it will be in the singly linked list made from -+ * page->private pointers. 
-+ **/
-+static void toi_end_bio(struct bio *bio, int err)
-+{
-+	struct page *page = bio->bi_io_vec[0].bv_page;
-+
-+	BUG_ON(!test_bit(BIO_UPTODATE, &bio->bi_flags));
-+
-+	unlock_page(page);
-+	bio_put(bio);
-+
-+	if (waiting_on == page)
-+		waiting_on = NULL;
-+
-+	put_page(page);
-+
-+	if (bio->bi_private)
-+		toi__free_page((int) ((unsigned long) bio->bi_private), page);
-+
-+	bio_put(bio);
-+
-+	atomic_dec(&toi_io_in_progress);
-+	atomic_inc(&toi_io_done);
-+
-+	wake_up(&num_in_progress_wait);
-+}
-+
-+/**
-+ * submit - submit BIO request
-+ * @writing: READ or WRITE.
-+ * @dev: The block device we're using.
-+ * @first_block: The first sector we're using.
-+ * @page: The page being used for I/O.
-+ * @free_group: If writing, the group that was used in allocating the page
-+ *	and which will be used in freeing the page from the completion
-+ *	routine.
-+ *
-+ * Based on Patrick Mochel's pmdisk code from long ago: "Straight from the
-+ * textbook - allocate and initialize the bio. If we're writing, make sure
-+ * the page is marked as dirty. Then submit it and carry on."
-+ *
-+ * If we're just testing the speed of our own code, we fake having done all
-+ * the hard work and call toi_end_bio immediately.
-+ **/
-+static int submit(int writing, struct block_device *dev, sector_t first_block,
-+		struct page *page, int free_group)
-+{
-+	struct bio *bio = NULL;
-+	int cur_outstanding_io, result;
-+
-+	/*
-+	 * Shouldn't throttle if reading - can deadlock in the single
-+	 * threaded case as pages are only freed when we use the
-+	 * readahead.
-+ */ -+ if (writing) { -+ result = throttle_if_needed(MEMORY_ONLY | THROTTLE_WAIT); -+ if (result) -+ return result; -+ } -+ -+ while (!bio) { -+ bio = bio_alloc(TOI_ATOMIC_GFP, 1); -+ if (!bio) { -+ set_free_mem_throttle(); -+ do_bio_wait(1); -+ } -+ } -+ -+ bio->bi_bdev = dev; -+ bio->bi_sector = first_block; -+ bio->bi_private = (void *) ((unsigned long) free_group); -+ bio->bi_end_io = toi_end_bio; -+ -+ if (bio_add_page(bio, page, PAGE_SIZE, 0) < PAGE_SIZE) { -+ printk(KERN_DEBUG "ERROR: adding page to bio at %lld\n", -+ (unsigned long long) first_block); -+ bio_put(bio); -+ return -EFAULT; -+ } -+ -+ bio_get(bio); -+ -+ cur_outstanding_io = atomic_add_return(1, &toi_io_in_progress); -+ if (writing) { -+ if (cur_outstanding_io > max_outstanding_writes) -+ max_outstanding_writes = cur_outstanding_io; -+ } else { -+ if (cur_outstanding_io > max_outstanding_reads) -+ max_outstanding_reads = cur_outstanding_io; -+ } -+ -+ -+ if (unlikely(test_action_state(TOI_TEST_BIO))) { -+ /* Fake having done the hard work */ -+ set_bit(BIO_UPTODATE, &bio->bi_flags); -+ toi_end_bio(bio, 0); -+ } else -+ submit_bio(writing | (1 << BIO_RW_SYNCIO) | -+ (1 << BIO_RW_TUXONICE) | -+ (1 << BIO_RW_UNPLUG), bio); -+ -+ return 0; -+} -+ -+/** -+ * toi_do_io: Prepare to do some i/o on a page and submit or batch it. -+ * -+ * @writing: Whether reading or writing. -+ * @bdev: The block device which we're using. -+ * @block0: The first sector we're reading or writing. -+ * @page: The page on which I/O is being done. -+ * @readahead_index: If doing readahead, the index (reset this flag when done). -+ * @syncio: Whether the i/o is being done synchronously. -+ * -+ * Prepare and start a read or write operation. -+ * -+ * Note that we always work with our own page. If writing, we might be given a -+ * compression buffer that will immediately be used to start compressing the -+ * next page. For reading, we do readahead and therefore don't know the final -+ * address where the data needs to go. 
-+ **/ -+int toi_do_io(int writing, struct block_device *bdev, long block0, -+ struct page *page, int is_readahead, int syncio, int free_group) -+{ -+ page->private = 0; -+ -+ /* Do here so we don't race against toi_bio_get_next_page_read */ -+ lock_page(page); -+ -+ if (is_readahead) { -+ if (readahead_list_head) -+ readahead_list_tail->private = (unsigned long) page; -+ else -+ readahead_list_head = page; -+ -+ readahead_list_tail = page; -+ } -+ -+ /* Done before submitting to avoid races. */ -+ if (syncio) -+ waiting_on = page; -+ -+ /* Submit the page */ -+ get_page(page); -+ -+ if (submit(writing, bdev, block0, page, free_group)) -+ return -EFAULT; -+ -+ if (syncio) -+ do_bio_wait(2); -+ -+ return 0; -+} -+ -+/** -+ * toi_bdev_page_io - simpler interface to do directly i/o on a single page -+ * @writing: Whether reading or writing. -+ * @bdev: Block device on which we're operating. -+ * @pos: Sector at which page to read or write starts. -+ * @page: Page to be read/written. -+ * -+ * A simple interface to submit a page of I/O and wait for its completion. -+ * The caller must free the page used. -+ **/ -+static int toi_bdev_page_io(int writing, struct block_device *bdev, -+ long pos, struct page *page) -+{ -+ return toi_do_io(writing, bdev, pos, page, 0, 1, 0); -+} -+ -+/** -+ * toi_bio_memory_needed - report the amount of memory needed for block i/o -+ * -+ * We want to have at least enough memory so as to have target_outstanding_io -+ * or more transactions on the fly at once. If we can do more, fine. -+ **/ -+static int toi_bio_memory_needed(void) -+{ -+ return target_outstanding_io * (PAGE_SIZE + sizeof(struct request) + -+ sizeof(struct bio)); -+} -+ -+/** -+ * toi_bio_print_debug_stats - put out debugging info in the buffer provided -+ * @buffer: A buffer of size @size into which text should be placed. -+ * @size: The size of @buffer. -+ * -+ * Fill a buffer with debugging info. 
This is used for both our debug_info sysfs -+ * entry and for recording the same info in dmesg. -+ **/ -+static int toi_bio_print_debug_stats(char *buffer, int size) -+{ -+ int len = 0; -+ -+ if (toiActiveAllocator != &toi_blockwriter_ops) { -+ len = scnprintf(buffer, size, -+ "- Block I/O inactive.\n"); -+ return len; -+ } -+ -+ len = scnprintf(buffer, size, "- Block I/O active.\n"); -+ -+ len += toi_bio_chains_debug_info(buffer + len, size - len); -+ -+ len += scnprintf(buffer + len, size - len, -+ "- Max outstanding reads %d. Max writes %d.\n", -+ max_outstanding_reads, max_outstanding_writes); -+ -+ len += scnprintf(buffer + len, size - len, -+ " Memory_needed: %d x (%lu + %u + %u) = %d bytes.\n", -+ target_outstanding_io, -+ PAGE_SIZE, (unsigned int) sizeof(struct request), -+ (unsigned int) sizeof(struct bio), toi_bio_memory_needed()); -+ -+#ifdef MEASURE_MUTEX_CONTENTION -+ { -+ int i; -+ -+ len += scnprintf(buffer + len, size - len, -+ " Mutex contention while reading:\n Contended Free\n"); -+ -+ for_each_online_cpu(i) -+ len += scnprintf(buffer + len, size - len, -+ " %9lu %9lu\n", -+ mutex_times[0][0][i], mutex_times[0][1][i]); -+ -+ len += scnprintf(buffer + len, size - len, -+ " Mutex contention while writing:\n Contended Free\n"); -+ -+ for_each_online_cpu(i) -+ len += scnprintf(buffer + len, size - len, -+ " %9lu %9lu\n", -+ mutex_times[1][0][i], mutex_times[1][1][i]); -+ -+ } -+#endif -+ -+ return len + scnprintf(buffer + len, size - len, -+ " Free mem throttle point reached %d.\n", free_mem_throttle); -+} -+ -+static int total_header_bytes; -+static int unowned; -+ -+void debug_broken_header(void) -+{ -+ printk(KERN_DEBUG "Image header too big for size allocated!\n"); -+ print_toi_header_storage_for_modules(); -+ printk(KERN_DEBUG "Page flags : %d.\n", toi_pageflags_space_needed()); -+ printk(KERN_DEBUG "toi_header : %zu.\n", sizeof(struct toi_header)); -+ printk(KERN_DEBUG "Total unowned : %d.\n", unowned); -+ printk(KERN_DEBUG "Total used : %d 
(%ld pages).\n", total_header_bytes, -+ DIV_ROUND_UP(total_header_bytes, PAGE_SIZE)); -+ printk(KERN_DEBUG "Space needed now : %ld.\n", -+ get_header_storage_needed()); -+ dump_block_chains(); -+ abort_hibernate(TOI_HEADER_TOO_BIG, "Header reservation too small."); -+} -+ -+/** -+ * toi_rw_init - prepare to read or write a stream in the image -+ * @writing: Whether reading or writing. -+ * @stream number: Section of the image being processed. -+ * -+ * Prepare to read or write a section ('stream') in the image. -+ **/ -+static int toi_rw_init(int writing, int stream_number) -+{ -+ if (stream_number) -+ toi_extent_state_restore(stream_number); -+ else -+ toi_extent_state_goto_start(); -+ -+ if (writing) { -+ reset_idx = 0; -+ if (!current_stream) -+ page_idx = 0; -+ } else { -+ reset_idx = 1; -+ } -+ -+ atomic_set(&toi_io_done, 0); -+ if (!toi_writer_buffer) -+ toi_writer_buffer = (char *) toi_get_zeroed_page(11, -+ TOI_ATOMIC_GFP); -+ toi_writer_buffer_posn = writing ? 0 : PAGE_SIZE; -+ -+ current_stream = stream_number; -+ -+ more_readahead = 1; -+ -+ return toi_writer_buffer ? 0 : -ENOMEM; -+} -+ -+/** -+ * toi_bio_queue_write - queue a page for writing -+ * @full_buffer: Pointer to a page to be queued -+ * -+ * Add a page to the queue to be submitted. If we're the queue flusher, -+ * we'll do this once we've dropped toi_bio_mutex, so other threads can -+ * continue to submit I/O while we're on the slow path doing the actual -+ * submission. 
-+ **/ -+static void toi_bio_queue_write(char **full_buffer) -+{ -+ struct page *page = virt_to_page(*full_buffer); -+ unsigned long flags; -+ -+ *full_buffer = NULL; -+ page->private = 0; -+ -+ spin_lock_irqsave(&bio_queue_lock, flags); -+ if (!bio_queue_head) -+ bio_queue_head = page; -+ else -+ bio_queue_tail->private = (unsigned long) page; -+ -+ bio_queue_tail = page; -+ atomic_inc(&toi_bio_queue_size); -+ -+ spin_unlock_irqrestore(&bio_queue_lock, flags); -+ wake_up(&toi_io_queue_flusher); -+} -+ -+/** -+ * toi_rw_cleanup - Cleanup after i/o. -+ * @writing: Whether we were reading or writing. -+ * -+ * Flush all I/O and clean everything up after reading or writing a -+ * section of the image. -+ **/ -+static int toi_rw_cleanup(int writing) -+{ -+ int i, result = 0; -+ -+ toi_message(TOI_IO, TOI_VERBOSE, 0, "toi_rw_cleanup."); -+ if (writing) { -+ if (toi_writer_buffer_posn && !test_result_state(TOI_ABORTED)) -+ toi_bio_queue_write(&toi_writer_buffer); -+ -+ while (bio_queue_head && !result) -+ result = toi_bio_queue_flush_pages(0); -+ -+ if (result) -+ return result; -+ -+ if (current_stream == 2) -+ toi_extent_state_save(1); -+ else if (current_stream == 1) -+ toi_extent_state_save(3); -+ } -+ -+ result = toi_finish_all_io(); -+ -+ while (readahead_list_head) { -+ void *next = (void *) readahead_list_head->private; -+ toi__free_page(12, readahead_list_head); -+ readahead_list_head = next; -+ } -+ -+ readahead_list_tail = NULL; -+ -+ if (!current_stream) -+ return result; -+ -+ for (i = 0; i < NUM_REASONS; i++) { -+ if (!atomic_read(&reasons[i])) -+ continue; -+ printk(KERN_DEBUG "Waited for i/o due to %s %d times.\n", -+ reason_name[i], atomic_read(&reasons[i])); -+ atomic_set(&reasons[i], 0); -+ } -+ -+ current_stream = 0; -+ return result; -+} -+ -+/** -+ * toi_start_one_readahead - start one page of readahead -+ * @dedicated_thread: Is this a thread dedicated to doing readahead? -+ * -+ * Start one new page of readahead. 
If this is being called by a thread
-+ * whose only job is to submit readahead, don't quit because we failed
-+ * to allocate a page.
-+ **/
-+static int toi_start_one_readahead(int dedicated_thread)
-+{
-+	char *buffer = NULL;
-+	int oom = 0, result;
-+
-+	result = throttle_if_needed(dedicated_thread ? THROTTLE_WAIT : 0);
-+	if (result)
-+		return result;
-+
-+	mutex_lock(&toi_bio_readahead_mutex);
-+
-+	while (!buffer) {
-+		buffer = (char *) toi_get_zeroed_page(12,
-+				TOI_ATOMIC_GFP);
-+		if (!buffer) {
-+			if (oom && !dedicated_thread) {
-+				mutex_unlock(&toi_bio_readahead_mutex);
-+				return -ENOMEM;
-+			}
-+
-+			oom = 1;
-+			set_free_mem_throttle();
-+			do_bio_wait(5);
-+		}
-+	}
-+
-+	result = toi_bio_rw_page(READ, virt_to_page(buffer), 1, 0);
-+	if (result == -ENOSPC)
-+		toi__free_page(12, virt_to_page(buffer));
-+	mutex_unlock(&toi_bio_readahead_mutex);
-+	if (result) {
-+		if (result == -ENOSPC)
-+			toi_message(TOI_IO, TOI_VERBOSE, 0,
-+					"Last readahead page submitted.");
-+		else
-+			printk(KERN_DEBUG "toi_bio_rw_page returned %d.\n",
-+					result);
-+	}
-+	return result;
-+}
-+
-+/**
-+ * toi_start_new_readahead - start new readahead
-+ * @dedicated_thread: Are we dedicated to this task?
-+ *
-+ * Start readahead of image pages.
-+ *
-+ * We can be called as a thread dedicated to this task (may be helpful on
-+ * systems with lots of CPUs), in which case we don't exit until there's no
-+ * more readahead.
-+ *
-+ * If this is not called by a dedicated thread, we top up our queue until
-+ * there's no more readahead to submit, we've submitted the number given
-+ * in target_outstanding_io or the number in progress exceeds the target
-+ * outstanding I/O value.
-+ *
-+ * No mutex needed because this is only ever called by the first cpu.
-+ **/
-+static int toi_start_new_readahead(int dedicated_thread)
-+{
-+	int last_result, num_submitted = 0;
-+
-+	/* Start a new readahead?
*/ -+ if (!more_readahead) -+ return 0; -+ -+ do { -+ last_result = toi_start_one_readahead(dedicated_thread); -+ -+ if (last_result) { -+ if (last_result == -ENOMEM || last_result == -ENOSPC) -+ return 0; -+ -+ printk(KERN_DEBUG -+ "Begin read chunk returned %d.\n", -+ last_result); -+ } else -+ num_submitted++; -+ -+ } while (more_readahead && !last_result && -+ (dedicated_thread || -+ (num_submitted < target_outstanding_io && -+ atomic_read(&toi_io_in_progress) < target_outstanding_io))); -+ -+ return last_result; -+} -+ -+/** -+ * bio_io_flusher - start the dedicated I/O flushing routine -+ * @writing: Whether we're writing the image. -+ **/ -+static int bio_io_flusher(int writing) -+{ -+ -+ if (writing) -+ return toi_bio_queue_flush_pages(1); -+ else -+ return toi_start_new_readahead(1); -+} -+ -+/** -+ * toi_bio_get_next_page_read - read a disk page, perhaps with readahead -+ * @no_readahead: Whether we can use readahead -+ * -+ * Read a page from disk, submitting readahead and cleaning up finished i/o -+ * while we wait for the page we're after. -+ **/ -+static int toi_bio_get_next_page_read(int no_readahead) -+{ -+ unsigned long *virt; -+ struct page *next; -+ -+ /* -+ * When reading the second page of the header, we have to -+ * delay submitting the read until after we've gotten the -+ * extents out of the first page. -+ */ -+ if (unlikely(no_readahead && toi_start_one_readahead(0))) { -+ printk(KERN_EMERG "No readahead and toi_start_one_readahead " -+ "returned non-zero.\n"); -+ return -EIO; -+ } -+ -+ if (unlikely(!readahead_list_head)) { -+ /* -+ * If the last page finishes exactly on the page -+ * boundary, we will be called one extra time and -+ * have no data to return. In this case, we should -+ * not BUG(), like we used to! 
-+ */ -+ if (!more_readahead) { -+ printk(KERN_EMERG "No more readahead.\n"); -+ return -ENOSPC; -+ } -+ if (unlikely(toi_start_one_readahead(0))) { -+ printk(KERN_EMERG "No readahead and " -+ "toi_start_one_readahead returned non-zero.\n"); -+ return -EIO; -+ } -+ } -+ -+ if (PageLocked(readahead_list_head)) { -+ waiting_on = readahead_list_head; -+ do_bio_wait(0); -+ } -+ -+ virt = page_address(readahead_list_head); -+ memcpy(toi_writer_buffer, virt, PAGE_SIZE); -+ -+ next = (struct page *) readahead_list_head->private; -+ toi__free_page(12, readahead_list_head); -+ readahead_list_head = next; -+ return 0; -+} -+ -+/** -+ * toi_bio_queue_flush_pages - flush the queue of pages queued for writing -+ * @dedicated_thread: Whether we're a dedicated thread -+ * -+ * Flush the queue of pages ready to be written to disk. -+ * -+ * If we're a dedicated thread, stay in here until told to leave, -+ * sleeping in wait_event. -+ * -+ * The first thread is normally the only one to come in here. Another -+ * thread can enter this routine too, though, via throttle_if_needed. -+ * Since that's the case, we must be careful to only have one thread -+ * doing this work at a time. Otherwise we have a race and could save -+ * pages out of order. -+ * -+ * If an error occurs, free all remaining pages without submitting them -+ * for I/O. 
-+ **/ -+ -+int toi_bio_queue_flush_pages(int dedicated_thread) -+{ -+ unsigned long flags; -+ int result = 0; -+ static DEFINE_MUTEX(busy); -+ -+ if (!mutex_trylock(&busy)) -+ return 0; -+ -+top: -+ spin_lock_irqsave(&bio_queue_lock, flags); -+ while (bio_queue_head) { -+ struct page *page = bio_queue_head; -+ bio_queue_head = (struct page *) page->private; -+ if (bio_queue_tail == page) -+ bio_queue_tail = NULL; -+ atomic_dec(&toi_bio_queue_size); -+ spin_unlock_irqrestore(&bio_queue_lock, flags); -+ -+ /* Don't generate more error messages if already had one */ -+ if (!result) -+ result = toi_bio_rw_page(WRITE, page, 0, 11); -+ /* -+ * If writing the page failed, don't drop out. -+ * Flush the rest of the queue too. -+ */ -+ if (result) -+ toi__free_page(11 , page); -+ spin_lock_irqsave(&bio_queue_lock, flags); -+ } -+ spin_unlock_irqrestore(&bio_queue_lock, flags); -+ -+ if (dedicated_thread) { -+ wait_event(toi_io_queue_flusher, bio_queue_head || -+ toi_bio_queue_flusher_should_finish); -+ if (likely(!toi_bio_queue_flusher_should_finish)) -+ goto top; -+ toi_bio_queue_flusher_should_finish = 0; -+ } -+ -+ mutex_unlock(&busy); -+ return result; -+} -+ -+/** -+ * toi_bio_get_new_page - get a new page for I/O -+ * @full_buffer: Pointer to a page to allocate. -+ **/ -+static int toi_bio_get_new_page(char **full_buffer) -+{ -+ int result = throttle_if_needed(THROTTLE_WAIT); -+ if (result) -+ return result; -+ -+ while (!*full_buffer) { -+ *full_buffer = (char *) toi_get_zeroed_page(11, TOI_ATOMIC_GFP); -+ if (!*full_buffer) { -+ set_free_mem_throttle(); -+ do_bio_wait(3); -+ } -+ } -+ -+ return 0; -+} -+ -+/** -+ * toi_rw_buffer - combine smaller buffers into PAGE_SIZE I/O -+ * @writing: Bool - whether writing (or reading). -+ * @buffer: The start of the buffer to write or fill. -+ * @buffer_size: The size of the buffer to write or fill. -+ * @no_readahead: Don't try to start readhead (when getting extents). 
-+ **/ -+static int toi_rw_buffer(int writing, char *buffer, int buffer_size, -+ int no_readahead) -+{ -+ int bytes_left = buffer_size, result = 0; -+ -+ while (bytes_left) { -+ char *source_start = buffer + buffer_size - bytes_left; -+ char *dest_start = toi_writer_buffer + toi_writer_buffer_posn; -+ int capacity = PAGE_SIZE - toi_writer_buffer_posn; -+ char *to = writing ? dest_start : source_start; -+ char *from = writing ? source_start : dest_start; -+ -+ if (bytes_left <= capacity) { -+ memcpy(to, from, bytes_left); -+ toi_writer_buffer_posn += bytes_left; -+ return 0; -+ } -+ -+ /* Complete this page and start a new one */ -+ memcpy(to, from, capacity); -+ bytes_left -= capacity; -+ -+ if (!writing) { -+ /* -+ * Perform actual I/O: -+ * read readahead_list_head into toi_writer_buffer -+ */ -+ int result = toi_bio_get_next_page_read(no_readahead); -+ if (result) { -+ printk("toi_bio_get_next_page_read " -+ "returned %d.\n", result); -+ return result; -+ } -+ } else { -+ toi_bio_queue_write(&toi_writer_buffer); -+ result = toi_bio_get_new_page(&toi_writer_buffer); -+ if (result) { -+ printk(KERN_ERR "toi_bio_get_new_page returned " -+ "%d.\n", result); -+ return result; -+ } -+ } -+ -+ toi_writer_buffer_posn = 0; -+ toi_cond_pause(0, NULL); -+ } -+ -+ return 0; -+} -+ -+/** -+ * toi_bio_read_page - read a page of the image -+ * @pfn: The pfn where the data belongs. -+ * @buffer_page: The page containing the (possibly compressed) data. -+ * @buf_size: The number of bytes on @buffer_page used (PAGE_SIZE). -+ * -+ * Read a (possibly compressed) page from the image, into buffer_page, -+ * returning its pfn and the buffer size. -+ **/ -+static int toi_bio_read_page(unsigned long *pfn, struct page *buffer_page, -+ unsigned int *buf_size) -+{ -+ int result = 0; -+ int this_idx; -+ char *buffer_virt = kmap(buffer_page); -+ -+ /* -+ * Only call start_new_readahead if we don't have a dedicated thread -+ * and we're the queue flusher. 
-+ */ -+ if (current == toi_queue_flusher && more_readahead) { -+ int result2 = toi_start_new_readahead(0); -+ if (result2) { -+ printk(KERN_DEBUG "Queue flusher and " -+ "toi_start_one_readahead returned non-zero.\n"); -+ result = -EIO; -+ goto out; -+ } -+ } -+ -+ my_mutex_lock(0, &toi_bio_mutex); -+ -+ /* -+ * Structure in the image: -+ * [destination pfn|page size|page data] -+ * buf_size is PAGE_SIZE -+ */ -+ if (toi_rw_buffer(READ, (char *) &this_idx, sizeof(int), 0) || -+ toi_rw_buffer(READ, (char *) pfn, sizeof(unsigned long), 0) || -+ toi_rw_buffer(READ, (char *) buf_size, sizeof(int), 0) || -+ toi_rw_buffer(READ, buffer_virt, *buf_size, 0)) { -+ abort_hibernate(TOI_FAILED_IO, "Read of data failed."); -+ result = 1; -+ } -+ -+ if (reset_idx) { -+ page_idx = this_idx; -+ reset_idx = 0; -+ } else { -+ page_idx++; -+ if (page_idx != this_idx) -+ printk(KERN_ERR "Got page index %d, expected %d.\n", -+ this_idx, page_idx); -+ } -+ -+ my_mutex_unlock(0, &toi_bio_mutex); -+out: -+ kunmap(buffer_page); -+ return result; -+} -+ -+/** -+ * toi_bio_write_page - write a page of the image -+ * @pfn: The pfn where the data belongs. -+ * @buffer_page: The page containing the (possibly compressed) data. -+ * @buf_size: The number of bytes on @buffer_page used. -+ * -+ * Write a (possibly compressed) page to the image from the buffer, together -+ * with it's index and buffer size. 
-+ **/
-+static int toi_bio_write_page(unsigned long pfn, struct page *buffer_page,
-+		unsigned int buf_size)
-+{
-+	char *buffer_virt;
-+	int result = 0, result2 = 0;
-+
-+	if (unlikely(test_action_state(TOI_TEST_FILTER_SPEED)))
-+		return 0;
-+
-+	my_mutex_lock(1, &toi_bio_mutex);
-+
-+	if (test_result_state(TOI_ABORTED)) {
-+		my_mutex_unlock(1, &toi_bio_mutex);
-+		return -EIO;
-+	}
-+
-+	buffer_virt = kmap(buffer_page);
-+	page_idx++;
-+
-+	/*
-+	 * Structure in the image:
-+	 *	[destination pfn|page size|page data]
-+	 * buf_size is PAGE_SIZE
-+	 */
-+	if (toi_rw_buffer(WRITE, (char *) &page_idx, sizeof(int), 0) ||
-+	    toi_rw_buffer(WRITE, (char *) &pfn, sizeof(unsigned long), 0) ||
-+	    toi_rw_buffer(WRITE, (char *) &buf_size, sizeof(int), 0) ||
-+	    toi_rw_buffer(WRITE, buffer_virt, buf_size, 0)) {
-+		printk(KERN_DEBUG "toi_rw_buffer returned non-zero to "
-+				"toi_bio_write_page.\n");
-+		result = -EIO;
-+	}
-+
-+	kunmap(buffer_page);
-+	my_mutex_unlock(1, &toi_bio_mutex);
-+
-+	if (current == toi_queue_flusher)
-+		result2 = toi_bio_queue_flush_pages(0);
-+
-+	return result ? result : result2;
-+}
-+
-+/**
-+ * _toi_rw_header_chunk - read or write a portion of the image header
-+ * @writing: Whether reading or writing.
-+ * @owner: The module for which we're writing. Used for confirming that
-+ *	modules don't use more header space than they asked for.
-+ * @buffer: Address of the data to write.
-+ * @buffer_size: Size of the data buffer.
-+ * @no_readahead: Don't try to start readahead (when getting extents).
-+ *
-+ * Perform PAGE_SIZE I/O. Start readahead if needed.
-+ **/ -+static int _toi_rw_header_chunk(int writing, struct toi_module_ops *owner, -+ char *buffer, int buffer_size, int no_readahead) -+{ -+ int result = 0; -+ -+ if (owner) { -+ owner->header_used += buffer_size; -+ toi_message(TOI_HEADER, TOI_LOW, 1, -+ "Header: %s : %d bytes (%d/%d) from offset %d.", -+ owner->name, -+ buffer_size, owner->header_used, -+ owner->header_requested, -+ toi_writer_buffer_posn); -+ if (owner->header_used > owner->header_requested && writing) { -+ printk(KERN_EMERG "TuxOnIce module %s is using more " -+ "header space (%u) than it requested (%u).\n", -+ owner->name, -+ owner->header_used, -+ owner->header_requested); -+ return buffer_size; -+ } -+ } else { -+ unowned += buffer_size; -+ toi_message(TOI_HEADER, TOI_LOW, 1, -+ "Header: (No owner): %d bytes (%d total so far) from " -+ "offset %d.", buffer_size, unowned, -+ toi_writer_buffer_posn); -+ } -+ -+ if (!writing && !no_readahead && more_readahead) { -+ result = toi_start_new_readahead(0); -+ toi_message(TOI_IO, TOI_VERBOSE, 0, "Start new readahead " -+ "returned %d.", result); -+ } -+ -+ if (!result) { -+ result = toi_rw_buffer(writing, buffer, buffer_size, -+ no_readahead); -+ toi_message(TOI_IO, TOI_VERBOSE, 0, "rw_buffer returned " -+ "%d.", result); -+ } -+ -+ total_header_bytes += buffer_size; -+ toi_message(TOI_IO, TOI_VERBOSE, 0, "_toi_rw_header_chunk returning " -+ "%d.", result); -+ return result; -+} -+ -+static int toi_rw_header_chunk(int writing, struct toi_module_ops *owner, -+ char *buffer, int size) -+{ -+ return _toi_rw_header_chunk(writing, owner, buffer, size, 1); -+} -+ -+static int toi_rw_header_chunk_noreadahead(int writing, -+ struct toi_module_ops *owner, char *buffer, int size) -+{ -+ return _toi_rw_header_chunk(writing, owner, buffer, size, 1); -+} -+ -+/** -+ * toi_bio_storage_needed - get the amount of storage needed for my fns -+ **/ -+static int toi_bio_storage_needed(void) -+{ -+ return sizeof(int) + PAGE_SIZE + toi_bio_devinfo_storage_needed(); -+} 
-+ -+/** -+ * toi_bio_save_config_info - save block I/O config to image header -+ * @buf: PAGE_SIZE'd buffer into which data should be saved. -+ **/ -+static int toi_bio_save_config_info(char *buf) -+{ -+ int *ints = (int *) buf; -+ ints[0] = target_outstanding_io; -+ return sizeof(int); -+} -+ -+/** -+ * toi_bio_load_config_info - restore block I/O config -+ * @buf: Data to be reloaded. -+ * @size: Size of the buffer saved. -+ **/ -+static void toi_bio_load_config_info(char *buf, int size) -+{ -+ int *ints = (int *) buf; -+ target_outstanding_io = ints[0]; -+} -+ -+void close_resume_dev_t(int force) -+{ -+ if (!resume_block_device) -+ return; -+ -+ if (force) -+ atomic_set(&resume_bdev_open_count, 0); -+ else -+ atomic_dec(&resume_bdev_open_count); -+ -+ if (!atomic_read(&resume_bdev_open_count)) { -+ toi_close_bdev(resume_block_device); -+ resume_block_device = NULL; -+ } -+} -+ -+int open_resume_dev_t(int force, int quiet) -+{ -+ if (force) { -+ close_resume_dev_t(1); -+ atomic_set(&resume_bdev_open_count, 1); -+ } else -+ atomic_inc(&resume_bdev_open_count); -+ -+ if (resume_block_device) -+ return 0; -+ -+ resume_block_device = toi_open_bdev(NULL, resume_dev_t, 0); -+ if (IS_ERR(resume_block_device)) { -+ if (!quiet) -+ toi_early_boot_message(1, TOI_CONTINUE_REQ, -+ "Failed to open device %x, where" -+ " the header should be found.", -+ resume_dev_t); -+ resume_block_device = NULL; -+ atomic_set(&resume_bdev_open_count, 0); -+ return 1; -+ } -+ -+ return 0; -+} -+ -+/** -+ * toi_bio_initialise - initialise bio code at start of some action -+ * @starting_cycle: Whether starting a hibernation cycle, or just reading or -+ * writing a sysfs value. 
-+ **/ -+static int toi_bio_initialise(int starting_cycle) -+{ -+ int result; -+ -+ if (!starting_cycle || !resume_dev_t) -+ return 0; -+ -+ max_outstanding_writes = 0; -+ max_outstanding_reads = 0; -+ current_stream = 0; -+ toi_queue_flusher = current; -+#ifdef MEASURE_MUTEX_CONTENTION -+ { -+ int i, j, k; -+ -+ for (i = 0; i < 2; i++) -+ for (j = 0; j < 2; j++) -+ for_each_online_cpu(k) -+ mutex_times[i][j][k] = 0; -+ } -+#endif -+ result = open_resume_dev_t(0, 1); -+ -+ if (result) -+ return result; -+ -+ return get_signature_page(); -+} -+ -+static unsigned long raw_to_real(unsigned long raw) -+{ -+ unsigned long result; -+ -+ result = raw - (raw * (sizeof(unsigned long) + sizeof(int)) + -+ (PAGE_SIZE + sizeof(unsigned long) + sizeof(int) + 1)) / -+ (PAGE_SIZE + sizeof(unsigned long) + sizeof(int)); -+ -+ return result < 0 ? 0 : result; -+} -+ -+static unsigned long toi_bio_storage_available(void) -+{ -+ unsigned long sum = 0; -+ struct toi_module_ops *this_module; -+ -+ list_for_each_entry(this_module, &toi_modules, module_list) { -+ if (!this_module->enabled || -+ this_module->type != BIO_ALLOCATOR_MODULE) -+ continue; -+ toi_message(TOI_IO, TOI_VERBOSE, 0, "Seeking storage " -+ "available from %s.", this_module->name); -+ sum += this_module->bio_allocator_ops->storage_available(); -+ } -+ -+ toi_message(TOI_IO, TOI_VERBOSE, 0, "Total storage available is %lu " -+ "pages.", sum); -+ return raw_to_real(sum - header_pages_reserved); -+ -+} -+ -+static unsigned long toi_bio_storage_allocated(void) -+{ -+ return raw_pages_allocd > header_pages_reserved ? -+ raw_to_real(raw_pages_allocd - header_pages_reserved) : 0; -+} -+ -+/* -+ * If we have read part of the image, we might have filled memory with -+ * data that should be zeroed out. 
-+ */ -+static void toi_bio_noresume_reset(void) -+{ -+ toi_message(TOI_IO, TOI_VERBOSE, 0, "toi_bio_noresume_reset."); -+ toi_rw_cleanup(READ); -+ free_all_bdev_info(); -+} -+ -+/** -+ * toi_bio_cleanup - cleanup after some action -+ * @finishing_cycle: Whether completing a cycle. -+ **/ -+static void toi_bio_cleanup(int finishing_cycle) -+{ -+ if (!finishing_cycle) -+ return; -+ -+ if (toi_writer_buffer) { -+ toi_free_page(11, (unsigned long) toi_writer_buffer); -+ toi_writer_buffer = NULL; -+ } -+ -+ forget_signature_page(); -+ -+ if (header_block_device && toi_sig_data && -+ toi_sig_data->header_dev_t != resume_dev_t) -+ toi_close_bdev(header_block_device); -+ -+ header_block_device = NULL; -+ -+ close_resume_dev_t(0); -+} -+ -+static int toi_bio_write_header_init(void) -+{ -+ int result; -+ -+ toi_message(TOI_IO, TOI_VERBOSE, 0, "toi_bio_write_header_init"); -+ toi_rw_init(WRITE, 0); -+ toi_writer_buffer_posn = 0; -+ -+ /* Info needed to bootstrap goes at the start of the header. -+ * First we save the positions and devinfo, including the number -+ * of header pages. Then we save the structs containing data needed -+ * for reading the header pages back. -+ * Note that even if header pages take more than one page, when we -+ * read back the info, we will have restored the location of the -+ * next header page by the time we go to use it. -+ */ -+ -+ toi_message(TOI_IO, TOI_VERBOSE, 0, "serialise extent chains."); -+ result = toi_serialise_extent_chains(); -+ -+ if (result) -+ return result; -+ -+ /* -+ * Signature page hasn't been modified at this point. Write it in -+ * the header so we can restore it later. 
-+ */ -+ toi_message(TOI_IO, TOI_VERBOSE, 0, "serialise signature page."); -+ return toi_rw_header_chunk_noreadahead(WRITE, &toi_blockwriter_ops, -+ (char *) toi_cur_sig_page, -+ PAGE_SIZE); -+} -+ -+static int toi_bio_write_header_cleanup(void) -+{ -+ int result = 0; -+ -+ if (toi_writer_buffer_posn) -+ toi_bio_queue_write(&toi_writer_buffer); -+ -+ result = toi_finish_all_io(); -+ -+ unowned = 0; -+ total_header_bytes = 0; -+ -+ /* Set signature to save we have an image */ -+ if (!result) -+ result = toi_bio_mark_have_image(); -+ -+ return result; -+} -+ -+/* -+ * toi_bio_read_header_init() -+ * -+ * Description: -+ * 1. Attempt to read the device specified with resume=. -+ * 2. Check the contents of the swap header for our signature. -+ * 3. Warn, ignore, reset and/or continue as appropriate. -+ * 4. If continuing, read the toi_swap configuration section -+ * of the header and set up block device info so we can read -+ * the rest of the header & image. -+ * -+ * Returns: -+ * May not return if user choose to reboot at a warning. -+ * -EINVAL if cannot resume at this time. Booting should continue -+ * normally. -+ */ -+ -+static int toi_bio_read_header_init(void) -+{ -+ int result = 0; -+ char buf[32]; -+ -+ toi_writer_buffer_posn = 0; -+ -+ toi_message(TOI_IO, TOI_VERBOSE, 0, "toi_bio_read_header_init"); -+ -+ if (!toi_sig_data) { -+ printk(KERN_INFO "toi_bio_read_header_init called when we " -+ "haven't verified there is an image!\n"); -+ return -EINVAL; -+ } -+ -+ /* -+ * If the header is not on the resume_swap_dev_t, get the resume device -+ * first. 
-+ */ -+ toi_message(TOI_IO, TOI_VERBOSE, 0, "Header dev_t is %lx.", -+ toi_sig_data->header_dev_t); -+ if (toi_sig_data->have_uuid) { -+ dev_t device; -+ device = blk_lookup_uuid(toi_sig_data->header_uuid); -+ if (device) { -+ printk("Using dev_t %s, returned by blk_lookup_uuid.\n", -+ format_dev_t(buf, device)); -+ toi_sig_data->header_dev_t = device; -+ } -+ } -+ if (toi_sig_data->header_dev_t != resume_dev_t) { -+ header_block_device = toi_open_bdev(NULL, -+ toi_sig_data->header_dev_t, 1); -+ -+ if (IS_ERR(header_block_device)) -+ return PTR_ERR(header_block_device); -+ } else -+ header_block_device = resume_block_device; -+ -+ if (!toi_writer_buffer) -+ toi_writer_buffer = (char *) toi_get_zeroed_page(11, -+ TOI_ATOMIC_GFP); -+ more_readahead = 1; -+ -+ /* -+ * Read toi_swap configuration. -+ * Headerblock size taken into account already. -+ */ -+ result = toi_bio_ops.bdev_page_io(READ, header_block_device, -+ toi_sig_data->first_header_block, -+ virt_to_page((unsigned long) toi_writer_buffer)); -+ if (result) -+ return result; -+ -+ toi_message(TOI_IO, TOI_VERBOSE, 0, "load extent chains."); -+ result = toi_load_extent_chains(); -+ -+ toi_message(TOI_IO, TOI_VERBOSE, 0, "load original signature page."); -+ toi_orig_sig_page = (char *) toi_get_zeroed_page(38, TOI_ATOMIC_GFP); -+ if (!toi_orig_sig_page) { -+ printk(KERN_ERR "Failed to allocate memory for the current" -+ " image signature.\n"); -+ return -ENOMEM; -+ } -+ -+ return toi_rw_header_chunk_noreadahead(READ, &toi_blockwriter_ops, -+ (char *) toi_orig_sig_page, -+ PAGE_SIZE); -+} -+ -+static int toi_bio_read_header_cleanup(void) -+{ -+ toi_message(TOI_IO, TOI_VERBOSE, 0, "toi_bio_read_header_cleanup."); -+ return toi_rw_cleanup(READ); -+} -+ -+/* Works only for digits and letters, but small and fast */ -+#define TOLOWER(x) ((x) | 0x20) -+ -+/* -+ * UUID must be 32 chars long. It may have dashes, but nothing -+ * else. 
-+ */ -+char *uuid_from_commandline(char *commandline) -+{ -+ int low = 0; -+ char *result = NULL, *output, *ptr; -+ -+ if (strncmp(commandline, "UUID=", 5)) -+ return NULL; -+ -+ result = kzalloc(17, GFP_KERNEL); -+ if (!result) { -+ printk("Failed to kzalloc UUID text memory.\n"); -+ return NULL; -+ } -+ -+ ptr = commandline + 5; -+ output = result; -+ -+ while (*ptr && (output - result) < 16) { -+ if (isxdigit(*ptr)) { -+ int value = isdigit(*ptr) ? *ptr - '0' : -+ TOLOWER(*ptr) - 'a' + 10; -+ if (low) { -+ *output += value; -+ output++; -+ } else { -+ *output = value << 4; -+ } -+ low = !low; -+ } else if (*ptr != '-') -+ break; -+ ptr++; -+ } -+ -+ if ((output - result) < 16 || *ptr) { -+ printk(KERN_DEBUG "Found resume=UUID=, but the value looks " -+ "invalid.\n"); -+ kfree(result); -+ result = NULL; -+ } -+ -+ return result; -+} -+ -+#define retry_if_fails(command) \ -+do { \ -+ command; \ -+ if (!resume_dev_t && !waited_for_device_probe) { \ -+ wait_for_device_probe(); \ -+ scsi_complete_async_scans(); \ -+ command; \ -+ waited_for_device_probe = 1; \ -+ } \ -+} while(0) -+ -+/** -+ * try_to_open_resume_device: Try to parse and open resume= -+ * -+ * Any "swap:" has been stripped away and we just have the path to deal with. -+ * We attempt to do name_to_dev_t, open and stat the file. Having opened the -+ * file, get the struct block_device * to match. 
-+ */ -+static int try_to_open_resume_device(char *commandline, int quiet) -+{ -+ struct kstat stat; -+ int error = 0; -+ char *uuid = uuid_from_commandline(commandline); -+ int waited_for_device_probe = 0; -+ -+ resume_dev_t = MKDEV(0, 0); -+ -+ if (!strlen(commandline)) -+ retry_if_fails(toi_bio_scan_for_image(quiet)); -+ -+ if (uuid) { -+ retry_if_fails(resume_dev_t = blk_lookup_uuid(uuid)); -+ kfree(uuid); -+ } -+ -+ if (!resume_dev_t) -+ retry_if_fails(resume_dev_t = name_to_dev_t(commandline)); -+ -+ if (!resume_dev_t) { -+ struct file *file = filp_open(commandline, -+ O_RDONLY|O_LARGEFILE, 0); -+ -+ if (!IS_ERR(file) && file) { -+ vfs_getattr(file->f_vfsmnt, file->f_dentry, &stat); -+ filp_close(file, NULL); -+ } else -+ error = vfs_stat(commandline, &stat); -+ if (!error) -+ resume_dev_t = stat.rdev; -+ } -+ -+ if (!resume_dev_t) { -+ if (quiet) -+ return 1; -+ -+ if (test_toi_state(TOI_TRYING_TO_RESUME)) -+ toi_early_boot_message(1, toi_translate_err_default, -+ "Failed to translate \"%s\" into a device id.\n", -+ commandline); -+ else -+ printk("TuxOnIce: Can't translate \"%s\" into a device " -+ "id yet.\n", commandline); -+ return 1; -+ } -+ -+ return open_resume_dev_t(1, quiet); -+} -+ -+/* -+ * Parse Image Location -+ * -+ * Attempt to parse a resume= parameter. -+ * Swap Writer accepts: -+ * resume=[swap:|file:]DEVNAME[:FIRSTBLOCK][@BLOCKSIZE] -+ * -+ * Where: -+ * DEVNAME is convertable to a dev_t by name_to_dev_t -+ * FIRSTBLOCK is the location of the first block in the swap file -+ * (specifying for a swap partition is nonsensical but not prohibited). -+ * Data is validated by attempting to read a swap header from the -+ * location given. Failure will result in toi_swap refusing to -+ * save an image, and a reboot with correct parameters will be -+ * necessary. 
-+ */ -+static int toi_bio_parse_sig_location(char *commandline, -+ int only_allocator, int quiet) -+{ -+ char *thischar, *devstart, *colon = NULL; -+ int signature_found, result = -EINVAL, temp_result = 0; -+ -+ if (strncmp(commandline, "swap:", 5) && -+ strncmp(commandline, "file:", 5)) { -+ /* -+ * Failing swap:, we'll take a simple resume=/dev/hda2, or a -+ * blank value (scan) but fall through to other allocators -+ * if /dev/ or UUID= isn't matched. -+ */ -+ if (strncmp(commandline, "/dev/", 5) && -+ strncmp(commandline, "UUID=", 5) && -+ strlen(commandline)) -+ return 1; -+ } else -+ commandline += 5; -+ -+ devstart = commandline; -+ thischar = commandline; -+ while ((*thischar != ':') && (*thischar != '@') && -+ ((thischar - commandline) < 250) && (*thischar)) -+ thischar++; -+ -+ if (*thischar == ':') { -+ colon = thischar; -+ *colon = 0; -+ thischar++; -+ } -+ -+ while ((thischar - commandline) < 250 && *thischar) -+ thischar++; -+ -+ if (colon) { -+ unsigned long block; -+ temp_result = strict_strtoul(colon + 1, 0, &block); -+ if (!temp_result) -+ resume_firstblock = (int) block; -+ } else -+ resume_firstblock = 0; -+ -+ clear_toi_state(TOI_CAN_HIBERNATE); -+ clear_toi_state(TOI_CAN_RESUME); -+ -+ if (!temp_result) -+ temp_result = try_to_open_resume_device(devstart, quiet); -+ -+ if (colon) -+ *colon = ':'; -+ -+ /* No error if we only scanned */ -+ if (temp_result) -+ return strlen(commandline) ? -EINVAL : 1; -+ -+ signature_found = toi_bio_image_exists(quiet); -+ -+ if (signature_found != -1) { -+ result = 0; -+ /* -+ * TODO: If only file storage, CAN_HIBERNATE should only be -+ * set if file allocator's target is valid. 
-+ */ -+ set_toi_state(TOI_CAN_HIBERNATE); -+ set_toi_state(TOI_CAN_RESUME); -+ } else -+ if (!quiet) -+ printk(KERN_ERR "TuxOnIce: Block I/O: No " -+ "signature found at %s.\n", devstart); -+ -+ close_resume_dev_t(0); -+ return result; -+} -+ -+static void toi_bio_release_storage(void) -+{ -+ header_pages_reserved = 0; -+ raw_pages_allocd = 0; -+ -+ free_all_bdev_info(); -+} -+ -+/* toi_swap_remove_image -+ * -+ */ -+static int toi_bio_remove_image(void) -+{ -+ int result; -+ -+ toi_message(TOI_IO, TOI_VERBOSE, 0, "toi_bio_remove_image."); -+ -+ result = toi_bio_restore_original_signature(); -+ -+ /* -+ * We don't do a sanity check here: we want to restore the swap -+ * whatever version of kernel made the hibernate image. -+ * -+ * We need to write swap, but swap may not be enabled so -+ * we write the device directly -+ * -+ * If we don't have an current_signature_page, we didn't -+ * read an image header, so don't change anything. -+ */ -+ -+ toi_bio_release_storage(); -+ -+ return result; -+} -+ -+struct toi_bio_ops toi_bio_ops = { -+ .bdev_page_io = toi_bdev_page_io, -+ .register_storage = toi_register_storage_chain, -+ .free_storage = toi_bio_release_storage, -+}; -+EXPORT_SYMBOL_GPL(toi_bio_ops); -+ -+static struct toi_sysfs_data sysfs_params[] = { -+ SYSFS_INT("target_outstanding_io", SYSFS_RW, &target_outstanding_io, -+ 0, 16384, 0, NULL), -+}; -+ -+struct toi_module_ops toi_blockwriter_ops = { -+ .type = WRITER_MODULE, -+ .name = "block i/o", -+ .directory = "block_io", -+ .module = THIS_MODULE, -+ .memory_needed = toi_bio_memory_needed, -+ .print_debug_info = toi_bio_print_debug_stats, -+ .storage_needed = toi_bio_storage_needed, -+ .save_config_info = toi_bio_save_config_info, -+ .load_config_info = toi_bio_load_config_info, -+ .initialise = toi_bio_initialise, -+ .cleanup = toi_bio_cleanup, -+ .post_atomic_restore = toi_bio_chains_post_atomic, -+ -+ .rw_init = toi_rw_init, -+ .rw_cleanup = toi_rw_cleanup, -+ .read_page = toi_bio_read_page, -+ 
.write_page = toi_bio_write_page, -+ .rw_header_chunk = toi_rw_header_chunk, -+ .rw_header_chunk_noreadahead = toi_rw_header_chunk_noreadahead, -+ .io_flusher = bio_io_flusher, -+ .update_throughput_throttle = update_throughput_throttle, -+ .finish_all_io = toi_finish_all_io, -+ -+ .noresume_reset = toi_bio_noresume_reset, -+ .storage_available = toi_bio_storage_available, -+ .storage_allocated = toi_bio_storage_allocated, -+ .reserve_header_space = toi_bio_reserve_header_space, -+ .allocate_storage = toi_bio_allocate_storage, -+ .image_exists = toi_bio_image_exists, -+ .mark_resume_attempted = toi_bio_mark_resume_attempted, -+ .write_header_init = toi_bio_write_header_init, -+ .write_header_cleanup = toi_bio_write_header_cleanup, -+ .read_header_init = toi_bio_read_header_init, -+ .read_header_cleanup = toi_bio_read_header_cleanup, -+ .get_header_version = toi_bio_get_header_version, -+ .remove_image = toi_bio_remove_image, -+ .parse_sig_location = toi_bio_parse_sig_location, -+ -+ .sysfs_data = sysfs_params, -+ .num_sysfs_entries = sizeof(sysfs_params) / -+ sizeof(struct toi_sysfs_data), -+}; -+ -+/** -+ * toi_block_io_load - load time routine for block I/O module -+ * -+ * Register block i/o ops and sysfs entries. 
-+ **/ -+static __init int toi_block_io_load(void) -+{ -+ return toi_register_module(&toi_blockwriter_ops); -+} -+ -+#ifdef MODULE -+static __exit void toi_block_io_unload(void) -+{ -+ toi_unregister_module(&toi_blockwriter_ops); -+} -+ -+module_init(toi_block_io_load); -+module_exit(toi_block_io_unload); -+MODULE_LICENSE("GPL"); -+MODULE_AUTHOR("Nigel Cunningham"); -+MODULE_DESCRIPTION("TuxOnIce block io functions"); -+#else -+late_initcall(toi_block_io_load); -+#endif -diff --git a/kernel/power/tuxonice_bio_internal.h b/kernel/power/tuxonice_bio_internal.h -new file mode 100644 -index 0000000..58c2481 ---- /dev/null -+++ b/kernel/power/tuxonice_bio_internal.h -@@ -0,0 +1,86 @@ -+/* -+ * kernel/power/tuxonice_bio_internal.h -+ * -+ * Copyright (C) 2009-2010 Nigel Cunningham (nigel at tuxonice net) -+ * -+ * Distributed under GPLv2. -+ * -+ * This file contains declarations for functions exported from -+ * tuxonice_bio.c, which contains low level io functions. -+ */ -+ -+/* Extent chains */ -+void toi_extent_state_goto_start(void); -+void toi_extent_state_save(int slot); -+int go_next_page(int writing, int section_barrier); -+void toi_extent_state_restore(int slot); -+void free_all_bdev_info(void); -+int devices_of_same_priority(struct toi_bdev_info *this); -+int toi_register_storage_chain(struct toi_bdev_info *new); -+int toi_serialise_extent_chains(void); -+int toi_load_extent_chains(void); -+int toi_bio_rw_page(int writing, struct page *page, int is_readahead, -+ int free_group); -+int toi_bio_restore_original_signature(void); -+int toi_bio_devinfo_storage_needed(void); -+unsigned long get_headerblock(void); -+dev_t get_header_dev_t(void); -+struct block_device *get_header_bdev(void); -+int toi_bio_allocate_storage(unsigned long request); -+ -+/* Signature functions */ -+#define HaveImage "HaveImage" -+#define NoImage "TuxOnIce" -+#define sig_size (sizeof(HaveImage)) -+ -+struct sig_data { -+ char sig[sig_size]; -+ int have_image; -+ int resumed_before; -+ -+ 
char have_uuid; -+ char header_uuid[17]; -+ dev_t header_dev_t; -+ unsigned long first_header_block; -+ -+ /* Repeat the signature to be sure we have a header version */ -+ char sig2[sig_size]; -+ int header_version; -+}; -+ -+void forget_signature_page(void); -+int toi_check_for_signature(void); -+int toi_bio_image_exists(int quiet); -+int get_signature_page(void); -+int toi_bio_mark_resume_attempted(int); -+extern char *toi_cur_sig_page; -+extern char *toi_orig_sig_page; -+int toi_bio_mark_have_image(void); -+extern struct sig_data *toi_sig_data; -+extern dev_t resume_dev_t; -+extern struct block_device *resume_block_device; -+extern struct block_device *header_block_device; -+extern unsigned long resume_firstblock; -+ -+struct block_device *open_bdev(dev_t device, int display_errs); -+extern int current_stream; -+extern int more_readahead; -+int toi_do_io(int writing, struct block_device *bdev, long block0, -+ struct page *page, int is_readahead, int syncio, int free_group); -+int get_main_pool_phys_params(void); -+ -+void toi_close_bdev(struct block_device *bdev); -+struct block_device *toi_open_bdev(char *uuid, dev_t default_device, -+ int display_errs); -+ -+extern struct toi_module_ops toi_blockwriter_ops; -+void dump_block_chains(void); -+void debug_broken_header(void); -+extern unsigned long raw_pages_allocd, header_pages_reserved; -+int toi_bio_chains_debug_info(char *buffer, int size); -+void toi_bio_chains_post_atomic(struct toi_boot_kernel_data *bkd); -+int toi_bio_scan_for_image(int quiet); -+int toi_bio_get_header_version(void); -+ -+void close_resume_dev_t(int force); -+int open_resume_dev_t(int force, int quiet); -diff --git a/kernel/power/tuxonice_bio_signature.c b/kernel/power/tuxonice_bio_signature.c -new file mode 100644 -index 0000000..e6f6cc8 ---- /dev/null -+++ b/kernel/power/tuxonice_bio_signature.c -@@ -0,0 +1,410 @@ -+/* -+ * kernel/power/tuxonice_bio_signature.c -+ * -+ * Copyright (C) 2004-2010 Nigel Cunningham (nigel at tuxonice net) 
-+ * -+ * Distributed under GPLv2. -+ * -+ */ -+ -+#include -+ -+#include "tuxonice.h" -+#include "tuxonice_sysfs.h" -+#include "tuxonice_modules.h" -+#include "tuxonice_prepare_image.h" -+#include "tuxonice_bio.h" -+#include "tuxonice_ui.h" -+#include "tuxonice_alloc.h" -+#include "tuxonice_io.h" -+#include "tuxonice_builtin.h" -+#include "tuxonice_bio_internal.h" -+ -+struct sig_data *toi_sig_data; -+ -+/* Struct of swap header pages */ -+ -+struct old_sig_data { -+ dev_t device; -+ unsigned long sector; -+ int resume_attempted; -+ int orig_sig_type; -+}; -+ -+union diskpage { -+ union swap_header swh; /* swh.magic is the only member used */ -+ struct sig_data sig_data; -+ struct old_sig_data old_sig_data; -+}; -+ -+union p_diskpage { -+ union diskpage *pointer; -+ char *ptr; -+ unsigned long address; -+}; -+ -+char *toi_cur_sig_page; -+char *toi_orig_sig_page; -+int have_image; -+int have_old_image; -+ -+int get_signature_page(void) -+{ -+ if (!toi_cur_sig_page) { -+ toi_message(TOI_IO, TOI_VERBOSE, 0, -+ "Allocating current signature page."); -+ toi_cur_sig_page = (char *) toi_get_zeroed_page(38, -+ TOI_ATOMIC_GFP); -+ if (!toi_cur_sig_page) { -+ printk(KERN_ERR "Failed to allocate memory for the " -+ "current image signature.\n"); -+ return -ENOMEM; -+ } -+ -+ toi_sig_data = (struct sig_data *) toi_cur_sig_page; -+ } -+ -+ toi_message(TOI_IO, TOI_VERBOSE, 0, "Reading signature from dev %lx," -+ " sector %d.", -+ resume_block_device->bd_dev, resume_firstblock); -+ -+ return toi_bio_ops.bdev_page_io(READ, resume_block_device, -+ resume_firstblock, virt_to_page(toi_cur_sig_page)); -+} -+ -+void forget_signature_page(void) -+{ -+ if (toi_cur_sig_page) { -+ toi_sig_data = NULL; -+ toi_message(TOI_IO, TOI_VERBOSE, 0, "Freeing toi_cur_sig_page" -+ " (%p).", toi_cur_sig_page); -+ toi_free_page(38, (unsigned long) toi_cur_sig_page); -+ toi_cur_sig_page = NULL; -+ } -+ -+ if (toi_orig_sig_page) { -+ toi_message(TOI_IO, TOI_VERBOSE, 0, "Freeing toi_orig_sig_page" -+ " 
(%p).", toi_orig_sig_page); -+ toi_free_page(38, (unsigned long) toi_orig_sig_page); -+ toi_orig_sig_page = NULL; -+ } -+} -+ -+/* -+ * We need to ensure we use the signature page that's currently on disk, -+ * so as to not remove the image header. Post-atomic-restore, the orig sig -+ * page will be empty, so we can use that as our method of knowing that we -+ * need to load the on-disk signature and not use the non-image sig in -+ * memory. (We're going to powerdown after writing the change, so it's safe. -+ */ -+int toi_bio_mark_resume_attempted(int flag) -+{ -+ toi_message(TOI_IO, TOI_VERBOSE, 0, "Make resume attempted = %d.", -+ flag); -+ if (!toi_orig_sig_page) { -+ forget_signature_page(); -+ get_signature_page(); -+ } -+ toi_sig_data->resumed_before = flag; -+ return toi_bio_ops.bdev_page_io(WRITE, resume_block_device, -+ resume_firstblock, virt_to_page(toi_cur_sig_page)); -+} -+ -+int toi_bio_mark_have_image(void) -+{ -+ int result = 0; -+ char buf[32]; -+ struct fs_info *fs_info; -+ -+ toi_message(TOI_IO, TOI_VERBOSE, 0, "Recording that an image exists."); -+ memcpy(toi_sig_data->sig, tuxonice_signature, -+ sizeof(tuxonice_signature)); -+ toi_sig_data->have_image = 1; -+ toi_sig_data->resumed_before = 0; -+ toi_sig_data->header_dev_t = get_header_dev_t(); -+ toi_sig_data->have_uuid = 0; -+ -+ fs_info = fs_info_from_block_dev(get_header_bdev()); -+ if (fs_info && !IS_ERR(fs_info)) { -+ memcpy(toi_sig_data->header_uuid, &fs_info->uuid, 16); -+ free_fs_info(fs_info); -+ } else -+ result = (int) PTR_ERR(fs_info); -+ -+ if (!result) { -+ toi_message(TOI_IO, TOI_VERBOSE, 0, "Got uuid for dev_t %s.", -+ format_dev_t(buf, get_header_dev_t())); -+ toi_sig_data->have_uuid = 1; -+ } else -+ toi_message(TOI_IO, TOI_VERBOSE, 0, "Could not get uuid for " -+ "dev_t %s.", -+ format_dev_t(buf, get_header_dev_t())); -+ -+ toi_sig_data->first_header_block = get_headerblock(); -+ have_image = 1; -+ toi_message(TOI_IO, TOI_VERBOSE, 0, "header dev_t is %x. 
First block " -+ "is %d.", toi_sig_data->header_dev_t, -+ toi_sig_data->first_header_block); -+ -+ memcpy(toi_sig_data->sig2, tuxonice_signature, -+ sizeof(tuxonice_signature)); -+ toi_sig_data->header_version = TOI_HEADER_VERSION; -+ -+ return toi_bio_ops.bdev_page_io(WRITE, resume_block_device, -+ resume_firstblock, virt_to_page(toi_cur_sig_page)); -+} -+ -+int remove_old_signature(void) -+{ -+ union p_diskpage swap_header_page = (union p_diskpage) toi_cur_sig_page; -+ char *orig_sig, *no_image_signature_contents; -+ char *header_start = (char *) toi_get_zeroed_page(38, TOI_ATOMIC_GFP); -+ int result; -+ struct block_device *header_bdev; -+ struct old_sig_data *old_sig_data = -+ &swap_header_page.pointer->old_sig_data; -+ -+ header_bdev = toi_open_bdev(NULL, old_sig_data->device, 1); -+ result = toi_bio_ops.bdev_page_io(READ, header_bdev, -+ old_sig_data->sector, virt_to_page(header_start)); -+ -+ if (result) -+ goto out; -+ -+ /* -+ * TODO: Get the original contents of the first bytes of the swap -+ * header page. -+ */ -+ if (!old_sig_data->orig_sig_type) -+ orig_sig = "SWAP-SPACE"; -+ else -+ orig_sig = "SWAPSPACE2"; -+ -+ memcpy(swap_header_page.pointer->swh.magic.magic, orig_sig, 10); -+ memcpy(swap_header_page.ptr, header_start, -+ sizeof(no_image_signature_contents)); -+ -+ result = toi_bio_ops.bdev_page_io(WRITE, resume_block_device, -+ resume_firstblock, virt_to_page(swap_header_page.ptr)); -+ -+out: -+ toi_close_bdev(header_bdev); -+ have_old_image = 0; -+ toi_free_page(38, (unsigned long) header_start); -+ return result; -+} -+ -+/* -+ * toi_bio_restore_original_signature - restore the original signature -+ * -+ * At boot time (aborting pre atomic-restore), toi_orig_sig_page gets used. -+ * It will have the original signature page contents, stored in the image -+ * header. Post atomic-restore, we use :toi_cur_sig_page, which will contain -+ * the contents that were loaded when we started the cycle. 
-+ */ -+int toi_bio_restore_original_signature(void) -+{ -+ char *use = toi_orig_sig_page ? toi_orig_sig_page : toi_cur_sig_page; -+ -+ if (have_old_image) -+ return remove_old_signature(); -+ -+ if (!use) { -+ printk("toi_bio_restore_original_signature: No signature " -+ "page loaded.\n"); -+ return 0; -+ } -+ -+ toi_message(TOI_IO, TOI_VERBOSE, 0, "Recording that no image exists."); -+ have_image = 0; -+ toi_sig_data->have_image = 0; -+ return toi_bio_ops.bdev_page_io(WRITE, resume_block_device, -+ resume_firstblock, virt_to_page(use)); -+} -+ -+/* -+ * check_for_signature - See whether we have an image. -+ * -+ * Returns 0 if no image, 1 if there is one, -1 if indeterminate. -+ */ -+int toi_check_for_signature(void) -+{ -+ union p_diskpage swap_header_page; -+ int type; -+ const char *normal_sigs[] = {"SWAP-SPACE", "SWAPSPACE2" }; -+ const char *swsusp_sigs[] = {"S1SUSP", "S2SUSP", "S1SUSPEND" }; -+ char *swap_header; -+ -+ if (!toi_cur_sig_page) { -+ int result = get_signature_page(); -+ -+ if (result) -+ return result; -+ } -+ -+ /* -+ * Start by looking for the binary header. -+ */ -+ if (!memcmp(tuxonice_signature, toi_cur_sig_page, -+ sizeof(tuxonice_signature))) { -+ have_image = toi_sig_data->have_image; -+ toi_message(TOI_IO, TOI_VERBOSE, 0, "Have binary signature. " -+ "Have image is %d.", have_image); -+ if (have_image) -+ toi_message(TOI_IO, TOI_VERBOSE, 0, "header dev_t is " -+ "%x. First block is %d.", -+ toi_sig_data->header_dev_t, -+ toi_sig_data->first_header_block); -+ return toi_sig_data->have_image; -+ } -+ -+ /* -+ * Failing that, try old file allocator headers. -+ */ -+ -+ if (!memcmp(HaveImage, toi_cur_sig_page, strlen(HaveImage))) { -+ have_image = 1; -+ return 1; -+ } -+ -+ have_image = 0; -+ -+ if (!memcmp(NoImage, toi_cur_sig_page, strlen(NoImage))) -+ return 0; -+ -+ /* -+ * Nope? How about swap? 
-+ */ -+ swap_header_page = (union p_diskpage) toi_cur_sig_page; -+ swap_header = swap_header_page.pointer->swh.magic.magic; -+ -+ /* Normal swapspace? */ -+ for (type = 0; type < 2; type++) -+ if (!memcmp(normal_sigs[type], swap_header, -+ strlen(normal_sigs[type]))) -+ return 0; -+ -+ /* Swsusp or uswsusp? */ -+ for (type = 0; type < 3; type++) -+ if (!memcmp(swsusp_sigs[type], swap_header, -+ strlen(swsusp_sigs[type]))) -+ return 2; -+ -+ /* Old TuxOnIce version? */ -+ if (!memcmp(tuxonice_signature, swap_header, -+ sizeof(tuxonice_signature) - 1)) { -+ toi_message(TOI_IO, TOI_VERBOSE, 0, "Found old TuxOnIce " -+ "signature."); -+ have_old_image = 1; -+ return 3; -+ } -+ -+ return -1; -+} -+ -+/* -+ * Image_exists -+ * -+ * Returns -1 if don't know, otherwise 0 (no) or 1 (yes). -+ */ -+int toi_bio_image_exists(int quiet) -+{ -+ int result; -+ char *orig_sig_page = toi_cur_sig_page; -+ char *msg = NULL; -+ -+ toi_message(TOI_IO, TOI_VERBOSE, 0, "toi_bio_image_exists."); -+ -+ if (!resume_dev_t) { -+ if (!quiet) -+ printk(KERN_INFO "Not even trying to read header " -+ "because resume_dev_t is not set.\n"); -+ return -1; -+ } -+ -+ if (open_resume_dev_t(0, quiet)) -+ return -1; -+ -+ result = toi_check_for_signature(); -+ -+ clear_toi_state(TOI_RESUMED_BEFORE); -+ if (toi_sig_data->resumed_before) -+ set_toi_state(TOI_RESUMED_BEFORE); -+ -+ if (quiet || result == -ENOMEM) -+ goto out; -+ -+ if (result == -1) -+ msg = "TuxOnIce: Unable to find a signature." 
-+ " Could you have moved a swap file?\n"; -+ else if (!result) -+ msg = "TuxOnIce: No image found.\n"; -+ else if (result == 1) -+ msg = "TuxOnIce: Image found.\n"; -+ else if (result == 2) -+ msg = "TuxOnIce: uswsusp or swsusp image found.\n"; -+ else if (result == 3) -+ msg = "TuxOnIce: Old implementation's signature found.\n"; -+ -+ printk(KERN_INFO "%s", msg); -+ -+out: -+ if (!orig_sig_page) -+ forget_signature_page(); -+ -+ close_resume_dev_t(0); -+ return result; -+} -+ -+int toi_bio_scan_for_image(int quiet) -+{ -+ struct block_device *bdev; -+ char default_name[255] = ""; -+ -+ if (!quiet) -+ printk(KERN_DEBUG "Scanning swap devices for TuxOnIce " -+ "signature...\n"); -+ for (bdev = next_bdev_of_type(NULL, "swap"); bdev; -+ bdev = next_bdev_of_type(bdev, "swap")) { -+ int result; -+ char name[255] = ""; -+ sprintf(name, "%u:%u", MAJOR(bdev->bd_dev), -+ MINOR(bdev->bd_dev)); -+ if (!quiet) -+ printk(KERN_DEBUG "- Trying %s.\n", name); -+ resume_block_device = bdev; -+ resume_dev_t = bdev->bd_dev; -+ -+ result = toi_check_for_signature(); -+ -+ resume_block_device = NULL; -+ resume_dev_t = MKDEV(0, 0); -+ -+ if (!default_name[0]) -+ strcpy(default_name, name); -+ -+ if (result == 1) { -+ /* Got one! */ -+ strcpy(resume_file, name); -+ next_bdev_of_type(bdev, NULL); -+ if (!quiet) -+ printk(KERN_DEBUG " ==> Image found on %s.\n", -+ resume_file); -+ return 1; -+ } -+ forget_signature_page(); -+ } -+ -+ if (!quiet) -+ printk(KERN_DEBUG "TuxOnIce scan: No image found.\n"); -+ strcpy(resume_file, default_name); -+ return 0; -+} -+ -+int toi_bio_get_header_version(void) -+{ -+ return (memcmp(toi_sig_data->sig2, tuxonice_signature, -+ sizeof(tuxonice_signature))) ? 
-+ 0 : toi_sig_data->header_version; -+ -+} -diff --git a/kernel/power/tuxonice_builtin.c b/kernel/power/tuxonice_builtin.c -new file mode 100644 -index 0000000..d9704f2 ---- /dev/null -+++ b/kernel/power/tuxonice_builtin.c -@@ -0,0 +1,360 @@ -+/* -+ * Copyright (C) 2004-2010 Nigel Cunningham (nigel at tuxonice net) -+ * -+ * This file is released under the GPLv2. -+ */ -+#include -+#include -+#include -+#include -+#include -+#include -+#include -+#include -+#include -+#include -+#include -+#include -+#include -+#include "tuxonice_io.h" -+#include "tuxonice.h" -+#include "tuxonice_extent.h" -+#include "tuxonice_netlink.h" -+#include "tuxonice_prepare_image.h" -+#include "tuxonice_ui.h" -+#include "tuxonice_sysfs.h" -+#include "tuxonice_pagedir.h" -+#include "tuxonice_modules.h" -+#include "tuxonice_builtin.h" -+#include "tuxonice_power_off.h" -+ -+/* -+ * Highmem related functions (x86 only). -+ */ -+ -+#ifdef CONFIG_HIGHMEM -+ -+/** -+ * copyback_high: Restore highmem pages. -+ * -+ * Highmem data and pbe lists are/can be stored in highmem. -+ * The format is slightly different to the lowmem pbe lists -+ * used for the assembly code: the last pbe in each page is -+ * a struct page * instead of struct pbe *, pointing to the -+ * next page where pbes are stored (or NULL if happens to be -+ * the end of the list). Since we don't want to generate -+ * unnecessary deltas against swsusp code, we use a cast -+ * instead of a union. 
-+ **/ -+ -+static void copyback_high(void) -+{ -+ struct page *pbe_page = (struct page *) restore_highmem_pblist; -+ struct pbe *this_pbe, *first_pbe; -+ unsigned long *origpage, *copypage; -+ int pbe_index = 1; -+ -+ if (!pbe_page) -+ return; -+ -+ this_pbe = (struct pbe *) kmap_atomic(pbe_page, KM_BOUNCE_READ); -+ first_pbe = this_pbe; -+ -+ while (this_pbe) { -+ int loop = (PAGE_SIZE / sizeof(unsigned long)) - 1; -+ -+ origpage = kmap_atomic((struct page *) this_pbe->orig_address, -+ KM_BIO_DST_IRQ); -+ copypage = kmap_atomic((struct page *) this_pbe->address, -+ KM_BIO_SRC_IRQ); -+ -+ while (loop >= 0) { -+ *(origpage + loop) = *(copypage + loop); -+ loop--; -+ } -+ -+ kunmap_atomic(origpage, KM_BIO_DST_IRQ); -+ kunmap_atomic(copypage, KM_BIO_SRC_IRQ); -+ -+ if (!this_pbe->next) -+ break; -+ -+ if (pbe_index < PBES_PER_PAGE) { -+ this_pbe++; -+ pbe_index++; -+ } else { -+ pbe_page = (struct page *) this_pbe->next; -+ kunmap_atomic(first_pbe, KM_BOUNCE_READ); -+ if (!pbe_page) -+ return; -+ this_pbe = (struct pbe *) kmap_atomic(pbe_page, -+ KM_BOUNCE_READ); -+ first_pbe = this_pbe; -+ pbe_index = 1; -+ } -+ } -+ kunmap_atomic(first_pbe, KM_BOUNCE_READ); -+} -+ -+#else /* CONFIG_HIGHMEM */ -+static void copyback_high(void) { } -+#endif -+ -+char toi_wait_for_keypress_dev_console(int timeout) -+{ -+ int fd, this_timeout = 255; -+ char key = '\0'; -+ struct termios t, t_backup; -+ -+ /* We should be guaranteed /dev/console exists after populate_rootfs() -+ * in init/main.c. -+ */ -+ fd = sys_open("/dev/console", O_RDONLY, 0); -+ if (fd < 0) { -+ printk(KERN_INFO "Couldn't open /dev/console.\n"); -+ return key; -+ } -+ -+ if (sys_ioctl(fd, TCGETS, (long)&t) < 0) -+ goto out_close; -+ -+ memcpy(&t_backup, &t, sizeof(t)); -+ -+ t.c_lflag &= ~(ISIG|ICANON|ECHO); -+ t.c_cc[VMIN] = 0; -+ -+new_timeout: -+ if (timeout > 0) { -+ this_timeout = timeout < 26 ? 
timeout : 25; -+ timeout -= this_timeout; -+ this_timeout *= 10; -+ } -+ -+ t.c_cc[VTIME] = this_timeout; -+ -+ if (sys_ioctl(fd, TCSETS, (long)&t) < 0) -+ goto out_restore; -+ -+ while (1) { -+ if (sys_read(fd, &key, 1) <= 0) { -+ if (timeout) -+ goto new_timeout; -+ key = '\0'; -+ break; -+ } -+ key = tolower(key); -+ if (test_toi_state(TOI_SANITY_CHECK_PROMPT)) { -+ if (key == 'c') { -+ set_toi_state(TOI_CONTINUE_REQ); -+ break; -+ } else if (key == ' ') -+ break; -+ } else -+ break; -+ } -+ -+out_restore: -+ sys_ioctl(fd, TCSETS, (long)&t_backup); -+out_close: -+ sys_close(fd); -+ -+ return key; -+} -+EXPORT_SYMBOL_GPL(toi_wait_for_keypress_dev_console); -+ -+struct toi_boot_kernel_data toi_bkd __nosavedata -+ __attribute__((aligned(PAGE_SIZE))) = { -+ MY_BOOT_KERNEL_DATA_VERSION, -+ 0, -+#ifdef CONFIG_TOI_REPLACE_SWSUSP -+ (1 << TOI_REPLACE_SWSUSP) | -+#endif -+ (1 << TOI_NO_FLUSHER_THREAD) | -+ (1 << TOI_PAGESET2_FULL) | (1 << TOI_LATE_CPU_HOTPLUG), -+}; -+EXPORT_SYMBOL_GPL(toi_bkd); -+ -+struct block_device *toi_open_by_devnum(dev_t dev) -+{ -+ struct block_device *bdev = bdget(dev); -+ int err = -ENOMEM; -+ if (bdev) -+ err = blkdev_get(bdev, FMODE_READ | FMODE_NDELAY); -+ return err ? ERR_PTR(err) : bdev; -+} -+EXPORT_SYMBOL_GPL(toi_open_by_devnum); -+ -+/** -+ * toi_close_bdev: Close a swap bdev. -+ * -+ * int: The swap entry number to close. 
-+ */ -+void toi_close_bdev(struct block_device *bdev) -+{ -+ blkdev_put(bdev, FMODE_READ | FMODE_NDELAY); -+} -+EXPORT_SYMBOL_GPL(toi_close_bdev); -+ -+int toi_wait = CONFIG_TOI_DEFAULT_WAIT; -+EXPORT_SYMBOL_GPL(toi_wait); -+ -+struct toi_core_fns *toi_core_fns; -+EXPORT_SYMBOL_GPL(toi_core_fns); -+ -+unsigned long toi_result; -+EXPORT_SYMBOL_GPL(toi_result); -+ -+struct pagedir pagedir1 = {1}; -+EXPORT_SYMBOL_GPL(pagedir1); -+ -+unsigned long toi_get_nonconflicting_page(void) -+{ -+ return toi_core_fns->get_nonconflicting_page(); -+} -+ -+int toi_post_context_save(void) -+{ -+ return toi_core_fns->post_context_save(); -+} -+ -+int try_tuxonice_hibernate(void) -+{ -+ if (!toi_core_fns) -+ return -ENODEV; -+ -+ return toi_core_fns->try_hibernate(); -+} -+ -+static int num_resume_calls; -+#ifdef CONFIG_TOI_IGNORE_LATE_INITCALL -+static int ignore_late_initcall = 1; -+#else -+static int ignore_late_initcall; -+#endif -+ -+int toi_translate_err_default = TOI_CONTINUE_REQ; -+EXPORT_SYMBOL_GPL(toi_translate_err_default); -+ -+void try_tuxonice_resume(void) -+{ -+ /* Don't let it wrap around eventually */ -+ if (num_resume_calls < 2) -+ num_resume_calls++; -+ -+ if (num_resume_calls == 1 && ignore_late_initcall) { -+ printk(KERN_INFO "TuxOnIce: Ignoring late initcall, as requested.\n"); -+ return; -+ } -+ -+ if (toi_core_fns) -+ toi_core_fns->try_resume(); -+ else -+ printk(KERN_INFO "TuxOnIce core not loaded yet.\n"); -+} -+ -+int toi_lowlevel_builtin(void) -+{ -+ int error = 0; -+ -+ save_processor_state(); -+ error = swsusp_arch_suspend(); -+ if (error) -+ printk(KERN_ERR "Error %d hibernating\n", error); -+ -+ /* Restore control flow appears here */ -+ if (!toi_in_hibernate) { -+ copyback_high(); -+ set_toi_state(TOI_NOW_RESUMING); -+ } -+ -+ restore_processor_state(); -+ -+ return error; -+} -+EXPORT_SYMBOL_GPL(toi_lowlevel_builtin); -+ -+unsigned long toi_compress_bytes_in; -+EXPORT_SYMBOL_GPL(toi_compress_bytes_in); -+ -+unsigned long toi_compress_bytes_out; 
-+EXPORT_SYMBOL_GPL(toi_compress_bytes_out); -+ -+unsigned long toi_state = ((1 << TOI_BOOT_TIME) | -+ (1 << TOI_IGNORE_LOGLEVEL) | -+ (1 << TOI_IO_STOPPED)); -+EXPORT_SYMBOL_GPL(toi_state); -+ -+/* The number of hibernates we have started (some may have been cancelled) */ -+unsigned int nr_hibernates; -+EXPORT_SYMBOL_GPL(nr_hibernates); -+ -+int toi_running; -+EXPORT_SYMBOL_GPL(toi_running); -+ -+__nosavedata int toi_in_hibernate; -+EXPORT_SYMBOL_GPL(toi_in_hibernate); -+ -+__nosavedata struct pbe *restore_highmem_pblist; -+EXPORT_SYMBOL_GPL(restore_highmem_pblist); -+ -+static int __init toi_wait_setup(char *str) -+{ -+ int value; -+ -+ if (sscanf(str, "=%d", &value)) { -+ if (value < -1 || value > 255) -+ printk(KERN_INFO "TuxOnIce_wait outside range -1 to " -+ "255.\n"); -+ else -+ toi_wait = value; -+ } -+ -+ return 1; -+} -+ -+__setup("toi_wait", toi_wait_setup); -+ -+static int __init toi_translate_retry_setup(char *str) -+{ -+ toi_translate_err_default = 0; -+ return 1; -+} -+ -+__setup("toi_translate_retry", toi_translate_retry_setup); -+ -+static int __init toi_debug_setup(char *str) -+{ -+ toi_bkd.toi_action |= (1 << TOI_LOGALL) | (1 << TOI_PAUSE); -+ toi_bkd.toi_debug_state = 255; -+ toi_bkd.toi_default_console_level = 7; -+ return 1; -+} -+ -+__setup("toi_debug_setup", toi_debug_setup); -+ -+static int __init toi_ignore_late_initcall_setup(char *str) -+{ -+ int value; -+ -+ if (sscanf(str, "=%d", &value)) -+ ignore_late_initcall = value; -+ -+ return 1; -+} -+ -+__setup("toi_initramfs_resume_only", toi_ignore_late_initcall_setup); -+ -+int toi_force_no_multithreaded; -+EXPORT_SYMBOL_GPL(toi_force_no_multithreaded); -+ -+static int __init toi_force_no_multithreaded_setup(char *str) -+{ -+ int value; -+ -+ if (sscanf(str, "=%d", &value)) -+ toi_force_no_multithreaded = value; -+ -+ return 1; -+} -+ -+__setup("toi_no_multithreaded", toi_force_no_multithreaded_setup); -diff --git a/kernel/power/tuxonice_builtin.h b/kernel/power/tuxonice_builtin.h -new file 
mode 100644 -index 0000000..56ede35 ---- /dev/null -+++ b/kernel/power/tuxonice_builtin.h -@@ -0,0 +1,30 @@ -+/* -+ * Copyright (C) 2004-2010 Nigel Cunningham (nigel at tuxonice net) -+ * -+ * This file is released under the GPLv2. -+ */ -+#include -+ -+extern struct toi_core_fns *toi_core_fns; -+extern unsigned long toi_compress_bytes_in, toi_compress_bytes_out; -+extern unsigned int nr_hibernates; -+extern int toi_in_hibernate; -+ -+extern __nosavedata struct pbe *restore_highmem_pblist; -+ -+int toi_lowlevel_builtin(void); -+ -+#ifdef CONFIG_HIGHMEM -+extern __nosavedata struct zone_data *toi_nosave_zone_list; -+extern __nosavedata unsigned long toi_nosave_max_pfn; -+#endif -+ -+extern unsigned long toi_get_nonconflicting_page(void); -+extern int toi_post_context_save(void); -+ -+extern char toi_wait_for_keypress_dev_console(int timeout); -+extern struct block_device *toi_open_by_devnum(dev_t dev); -+extern void toi_close_bdev(struct block_device *bdev); -+extern int toi_wait; -+extern int toi_translate_err_default; -+extern int toi_force_no_multithreaded; -diff --git a/kernel/power/tuxonice_checksum.c b/kernel/power/tuxonice_checksum.c -new file mode 100644 -index 0000000..3ec2c76 ---- /dev/null -+++ b/kernel/power/tuxonice_checksum.c -@@ -0,0 +1,377 @@ -+/* -+ * kernel/power/tuxonice_checksum.c -+ * -+ * Copyright (C) 2006-2010 Nigel Cunningham (nigel at tuxonice net) -+ * -+ * This file is released under the GPLv2. -+ * -+ * This file contains data checksum routines for TuxOnIce, -+ * using cryptoapi. They are used to locate any modifications -+ * made to pageset 2 while we're saving it. 
-+ */ -+ -+#include -+#include -+#include -+#include -+#include -+ -+#include "tuxonice.h" -+#include "tuxonice_modules.h" -+#include "tuxonice_sysfs.h" -+#include "tuxonice_io.h" -+#include "tuxonice_pageflags.h" -+#include "tuxonice_checksum.h" -+#include "tuxonice_pagedir.h" -+#include "tuxonice_alloc.h" -+ -+static struct toi_module_ops toi_checksum_ops; -+ -+/* Constant at the mo, but I might allow tuning later */ -+static char toi_checksum_name[32] = "md4"; -+/* Bytes per checksum */ -+#define CHECKSUM_SIZE (16) -+ -+#define CHECKSUMS_PER_PAGE ((PAGE_SIZE - sizeof(void *)) / CHECKSUM_SIZE) -+ -+struct cpu_context { -+ struct crypto_hash *transform; -+ struct hash_desc desc; -+ struct scatterlist sg[2]; -+ char *buf; -+}; -+ -+static DEFINE_PER_CPU(struct cpu_context, contexts); -+static int pages_allocated; -+static unsigned long page_list; -+ -+static int toi_num_resaved; -+ -+static unsigned long this_checksum, next_page; -+static int checksum_index; -+ -+static inline int checksum_pages_needed(void) -+{ -+ return DIV_ROUND_UP(pagedir2.size, CHECKSUMS_PER_PAGE); -+} -+ -+/* ---- Local buffer management ---- */ -+ -+/* -+ * toi_checksum_cleanup -+ * -+ * Frees memory allocated for our labours. -+ */ -+static void toi_checksum_cleanup(int ending_cycle) -+{ -+ int cpu; -+ -+ if (ending_cycle) { -+ for_each_online_cpu(cpu) { -+ struct cpu_context *this = &per_cpu(contexts, cpu); -+ if (this->transform) { -+ crypto_free_hash(this->transform); -+ this->transform = NULL; -+ this->desc.tfm = NULL; -+ } -+ -+ if (this->buf) { -+ toi_free_page(27, (unsigned long) this->buf); -+ this->buf = NULL; -+ } -+ } -+ } -+} -+ -+/* -+ * toi_crypto_initialise -+ * -+ * Prepare to do some work by allocating buffers and transforms. -+ * Returns: Int: Zero. Even if we can't set up checksum, we still -+ * seek to hibernate. 
-+ */ -+static int toi_checksum_initialise(int starting_cycle) -+{ -+ int cpu; -+ -+ if (!(starting_cycle & SYSFS_HIBERNATE) || !toi_checksum_ops.enabled) -+ return 0; -+ -+ if (!*toi_checksum_name) { -+ printk(KERN_INFO "TuxOnIce: No checksum algorithm name set.\n"); -+ return 1; -+ } -+ -+ for_each_online_cpu(cpu) { -+ struct cpu_context *this = &per_cpu(contexts, cpu); -+ struct page *page; -+ -+ this->transform = crypto_alloc_hash(toi_checksum_name, 0, 0); -+ if (IS_ERR(this->transform)) { -+ printk(KERN_INFO "TuxOnIce: Failed to initialise the " -+ "%s checksum algorithm: %ld.\n", -+ toi_checksum_name, (long) this->transform); -+ this->transform = NULL; -+ return 1; -+ } -+ -+ this->desc.tfm = this->transform; -+ this->desc.flags = 0; -+ -+ page = toi_alloc_page(27, GFP_KERNEL); -+ if (!page) -+ return 1; -+ this->buf = page_address(page); -+ sg_init_one(&this->sg[0], this->buf, PAGE_SIZE); -+ } -+ return 0; -+} -+ -+/* -+ * toi_checksum_print_debug_stats -+ * @buffer: Pointer to a buffer into which the debug info will be printed. -+ * @size: Size of the buffer. -+ * -+ * Print information to be recorded for debugging purposes into a buffer. -+ * Returns: Number of characters written to the buffer. -+ */ -+ -+static int toi_checksum_print_debug_stats(char *buffer, int size) -+{ -+ int len; -+ -+ if (!toi_checksum_ops.enabled) -+ return scnprintf(buffer, size, -+ "- Checksumming disabled.\n"); -+ -+ len = scnprintf(buffer, size, "- Checksum method is '%s'.\n", -+ toi_checksum_name); -+ len += scnprintf(buffer + len, size - len, -+ " %d pages resaved in atomic copy.\n", toi_num_resaved); -+ return len; -+} -+ -+static int toi_checksum_memory_needed(void) -+{ -+ return toi_checksum_ops.enabled ? 
-+ checksum_pages_needed() << PAGE_SHIFT : 0; -+} -+ -+static int toi_checksum_storage_needed(void) -+{ -+ if (toi_checksum_ops.enabled) -+ return strlen(toi_checksum_name) + sizeof(int) + 1; -+ else -+ return 0; -+} -+ -+/* -+ * toi_checksum_save_config_info -+ * @buffer: Pointer to a buffer of size PAGE_SIZE. -+ * -+ * Save informaton needed when reloading the image at resume time. -+ * Returns: Number of bytes used for saving our data. -+ */ -+static int toi_checksum_save_config_info(char *buffer) -+{ -+ int namelen = strlen(toi_checksum_name) + 1; -+ int total_len; -+ -+ *((unsigned int *) buffer) = namelen; -+ strncpy(buffer + sizeof(unsigned int), toi_checksum_name, namelen); -+ total_len = sizeof(unsigned int) + namelen; -+ return total_len; -+} -+ -+/* toi_checksum_load_config_info -+ * @buffer: Pointer to the start of the data. -+ * @size: Number of bytes that were saved. -+ * -+ * Description: Reload information needed for dechecksuming the image at -+ * resume time. -+ */ -+static void toi_checksum_load_config_info(char *buffer, int size) -+{ -+ int namelen; -+ -+ namelen = *((unsigned int *) (buffer)); -+ strncpy(toi_checksum_name, buffer + sizeof(unsigned int), -+ namelen); -+ return; -+} -+ -+/* -+ * Free Checksum Memory -+ */ -+ -+void free_checksum_pages(void) -+{ -+ while (pages_allocated) { -+ unsigned long next = *((unsigned long *) page_list); -+ ClearPageNosave(virt_to_page(page_list)); -+ toi_free_page(15, (unsigned long) page_list); -+ page_list = next; -+ pages_allocated--; -+ } -+} -+ -+/* -+ * Allocate Checksum Memory -+ */ -+ -+int allocate_checksum_pages(void) -+{ -+ int pages_needed = checksum_pages_needed(); -+ -+ if (!toi_checksum_ops.enabled) -+ return 0; -+ -+ while (pages_allocated < pages_needed) { -+ unsigned long *new_page = -+ (unsigned long *) toi_get_zeroed_page(15, TOI_ATOMIC_GFP); -+ if (!new_page) { -+ printk(KERN_ERR "Unable to allocate checksum pages.\n"); -+ return -ENOMEM; -+ } -+ SetPageNosave(virt_to_page(new_page)); 
-+ (*new_page) = page_list; -+ page_list = (unsigned long) new_page; -+ pages_allocated++; -+ } -+ -+ next_page = (unsigned long) page_list; -+ checksum_index = 0; -+ -+ return 0; -+} -+ -+char *tuxonice_get_next_checksum(void) -+{ -+ if (!toi_checksum_ops.enabled) -+ return NULL; -+ -+ if (checksum_index % CHECKSUMS_PER_PAGE) -+ this_checksum += CHECKSUM_SIZE; -+ else { -+ this_checksum = next_page + sizeof(void *); -+ next_page = *((unsigned long *) next_page); -+ } -+ -+ checksum_index++; -+ return (char *) this_checksum; -+} -+ -+int tuxonice_calc_checksum(struct page *page, char *checksum_locn) -+{ -+ char *pa; -+ int result, cpu = smp_processor_id(); -+ struct cpu_context *ctx = &per_cpu(contexts, cpu); -+ -+ if (!toi_checksum_ops.enabled) -+ return 0; -+ -+ pa = kmap(page); -+ memcpy(ctx->buf, pa, PAGE_SIZE); -+ kunmap(page); -+ result = crypto_hash_digest(&ctx->desc, ctx->sg, PAGE_SIZE, -+ checksum_locn); -+ if (result) -+ printk(KERN_ERR "TuxOnIce checksumming: crypto_hash_digest " -+ "returned %d.\n", result); -+ return result; -+} -+/* -+ * Calculate checksums -+ */ -+ -+void check_checksums(void) -+{ -+ int pfn, index = 0, cpu = smp_processor_id(); -+ char current_checksum[CHECKSUM_SIZE]; -+ struct cpu_context *ctx = &per_cpu(contexts, cpu); -+ -+ if (!toi_checksum_ops.enabled) -+ return; -+ -+ next_page = (unsigned long) page_list; -+ -+ toi_num_resaved = 0; -+ this_checksum = 0; -+ -+ memory_bm_position_reset(pageset2_map); -+ for (pfn = memory_bm_next_pfn(pageset2_map); pfn != BM_END_OF_MAP; -+ pfn = memory_bm_next_pfn(pageset2_map)) { -+ int ret; -+ char *pa; -+ struct page *page = pfn_to_page(pfn); -+ -+ if (index % CHECKSUMS_PER_PAGE) { -+ this_checksum += CHECKSUM_SIZE; -+ } else { -+ this_checksum = next_page + sizeof(void *); -+ next_page = *((unsigned long *) next_page); -+ } -+ -+ /* Done when IRQs disabled so must be atomic */ -+ pa = kmap_atomic(page, KM_USER1); -+ memcpy(ctx->buf, pa, PAGE_SIZE); -+ kunmap_atomic(pa, KM_USER1); -+ ret = 
crypto_hash_digest(&ctx->desc, ctx->sg, PAGE_SIZE, -+ current_checksum); -+ -+ if (ret) { -+ printk(KERN_INFO "Digest failed. Returned %d.\n", ret); -+ return; -+ } -+ -+ if (memcmp(current_checksum, (char *) this_checksum, -+ CHECKSUM_SIZE)) { -+ SetPageResave(pfn_to_page(pfn)); -+ toi_num_resaved++; -+ if (test_action_state(TOI_ABORT_ON_RESAVE_NEEDED)) -+ set_abort_result(TOI_RESAVE_NEEDED); -+ } -+ -+ index++; -+ } -+} -+ -+static struct toi_sysfs_data sysfs_params[] = { -+ SYSFS_INT("enabled", SYSFS_RW, &toi_checksum_ops.enabled, 0, 1, 0, -+ NULL), -+ SYSFS_BIT("abort_if_resave_needed", SYSFS_RW, &toi_bkd.toi_action, -+ TOI_ABORT_ON_RESAVE_NEEDED, 0) -+}; -+ -+/* -+ * Ops structure. -+ */ -+static struct toi_module_ops toi_checksum_ops = { -+ .type = MISC_MODULE, -+ .name = "checksumming", -+ .directory = "checksum", -+ .module = THIS_MODULE, -+ .initialise = toi_checksum_initialise, -+ .cleanup = toi_checksum_cleanup, -+ .print_debug_info = toi_checksum_print_debug_stats, -+ .save_config_info = toi_checksum_save_config_info, -+ .load_config_info = toi_checksum_load_config_info, -+ .memory_needed = toi_checksum_memory_needed, -+ .storage_needed = toi_checksum_storage_needed, -+ -+ .sysfs_data = sysfs_params, -+ .num_sysfs_entries = sizeof(sysfs_params) / -+ sizeof(struct toi_sysfs_data), -+}; -+ -+/* ---- Registration ---- */ -+int toi_checksum_init(void) -+{ -+ int result = toi_register_module(&toi_checksum_ops); -+ return result; -+} -+ -+void toi_checksum_exit(void) -+{ -+ toi_unregister_module(&toi_checksum_ops); -+} -diff --git a/kernel/power/tuxonice_checksum.h b/kernel/power/tuxonice_checksum.h -new file mode 100644 -index 0000000..0f2812e ---- /dev/null -+++ b/kernel/power/tuxonice_checksum.h -@@ -0,0 +1,31 @@ -+/* -+ * kernel/power/tuxonice_checksum.h -+ * -+ * Copyright (C) 2006-2010 Nigel Cunningham (nigel at tuxonice net) -+ * -+ * This file is released under the GPLv2. 
-+ * -+ * This file contains data checksum routines for TuxOnIce, -+ * using cryptoapi. They are used to locate any modifications -+ * made to pageset 2 while we're saving it. -+ */ -+ -+#if defined(CONFIG_TOI_CHECKSUM) -+extern int toi_checksum_init(void); -+extern void toi_checksum_exit(void); -+void check_checksums(void); -+int allocate_checksum_pages(void); -+void free_checksum_pages(void); -+char *tuxonice_get_next_checksum(void); -+int tuxonice_calc_checksum(struct page *page, char *checksum_locn); -+#else -+static inline int toi_checksum_init(void) { return 0; } -+static inline void toi_checksum_exit(void) { } -+static inline void check_checksums(void) { }; -+static inline int allocate_checksum_pages(void) { return 0; }; -+static inline void free_checksum_pages(void) { }; -+static inline char *tuxonice_get_next_checksum(void) { return NULL; }; -+static inline int tuxonice_calc_checksum(struct page *page, char *checksum_locn) -+ { return 0; } -+#endif -+ -diff --git a/kernel/power/tuxonice_cluster.c b/kernel/power/tuxonice_cluster.c -new file mode 100644 -index 0000000..0e5a262 ---- /dev/null -+++ b/kernel/power/tuxonice_cluster.c -@@ -0,0 +1,1069 @@ -+/* -+ * kernel/power/tuxonice_cluster.c -+ * -+ * Copyright (C) 2006-2010 Nigel Cunningham (nigel at tuxonice net) -+ * -+ * This file is released under the GPLv2. -+ * -+ * This file contains routines for cluster hibernation support. -+ * -+ * Based on ip autoconfiguration code in net/ipv4/ipconfig.c. -+ * -+ * How does it work? -+ * -+ * There is no 'master' node that tells everyone else what to do. All nodes -+ * send messages to the broadcast address/port, maintain a list of peers -+ * and figure out when to progress to the next step in hibernating or resuming. -+ * This makes us more fault tolerant when it comes to nodes coming and going -+ * (which may be more of an issue if we're hibernating when power supplies -+ * are being unreliable). 
-+ * -+ * At boot time, we start a ktuxonice thread that handles communication with -+ * other nodes. This node maintains a state machine that controls our progress -+ * through hibernating and resuming, keeping us in step with other nodes. Nodes -+ * are identified by their hw address. -+ * -+ * On startup, the node sends CLUSTER_PING on the configured interface's -+ * broadcast address, port $toi_cluster_port (see below) and begins to listen -+ * for other broadcast messages. CLUSTER_PING messages are repeated at -+ * intervals of 5 minutes, with a random offset to spread traffic out. -+ * -+ * A hibernation cycle is initiated from any node via -+ * -+ * echo > /sys/power/tuxonice/do_hibernate -+ * -+ * and (possibily) the hibernate script. At each step of the process, the node -+ * completes its work, and waits for all other nodes to signal completion of -+ * their work (or timeout) before progressing to the next step. -+ * -+ * Request/state Action before reply Possible reply Next state -+ * HIBERNATE capable, pre-script HIBERNATE|ACK NODE_PREP -+ * HIBERNATE|NACK INIT_0 -+ * -+ * PREP prepare_image PREP|ACK IMAGE_WRITE -+ * PREP|NACK INIT_0 -+ * ABORT RUNNING -+ * -+ * IO write image IO|ACK power off -+ * ABORT POST_RESUME -+ * -+ * (Boot time) check for image IMAGE|ACK RESUME_PREP -+ * (Note 1) -+ * IMAGE|NACK (Note 2) -+ * -+ * PREP prepare read image PREP|ACK IMAGE_READ -+ * PREP|NACK (As NACK_IMAGE) -+ * -+ * IO read image IO|ACK POST_RESUME -+ * -+ * POST_RESUME thaw, post-script RUNNING -+ * -+ * INIT_0 init 0 -+ * -+ * Other messages: -+ * -+ * - PING: Request for all other live nodes to send a PONG. Used at startup to -+ * announce presence, when a node is suspected dead and periodically, in case -+ * segments of the network are [un]plugged. -+ * -+ * - PONG: Response to a PING. -+ * -+ * - ABORT: Request to cancel writing an image. -+ * -+ * - BYE: Notification that this node is shutting down. 
-+ * -+ * Note 1: Repeated at 3s intervals until we continue to boot/resume, so that -+ * nodes which are slower to start up can get state synchronised. If a node -+ * starting up sees other nodes sending RESUME_PREP or IMAGE_READ, it may send -+ * ACK_IMAGE and they will wait for it to catch up. If it sees ACK_READ, it -+ * must invalidate its image (if any) and boot normally. -+ * -+ * Note 2: May occur when one node lost power or powered off while others -+ * hibernated. This node waits for others to complete resuming (ACK_READ) -+ * before completing its boot, so that it appears as a fail node restarting. -+ * -+ * If any node has an image, then it also has a list of nodes that hibernated -+ * in synchronisation with it. The node will wait for other nodes to appear -+ * or timeout before beginning its restoration. -+ * -+ * If a node has no image, it needs to wait, in case other nodes which do have -+ * an image are going to resume, but are taking longer to announce their -+ * presence. For this reason, the user can specify a timeout value and a number -+ * of nodes detected before we just continue. (We might want to assume in a -+ * cluster of, say, 15 nodes, if 8 others have booted without finding an image, -+ * the remaining nodes will too. This might help in situations where some nodes -+ * are much slower to boot, or more subject to hardware failures or such like). -+ */ -+ -+#include -+#include -+#include -+#include -+#include -+#include -+#include -+#include -+#include -+#include -+#include -+#include -+#include -+ -+#include "tuxonice.h" -+#include "tuxonice_modules.h" -+#include "tuxonice_sysfs.h" -+#include "tuxonice_alloc.h" -+#include "tuxonice_io.h" -+ -+#if 1 -+#define PRINTK(a, b...) do { printk(a, ##b); } while (0) -+#else -+#define PRINTK(a, b...) do { } while (0) -+#endif -+ -+static int loopback_mode; -+static int num_local_nodes = 1; -+#define MAX_LOCAL_NODES 8 -+#define SADDR (loopback_mode ? 
b->sid : h->saddr) -+ -+#define MYNAME "TuxOnIce Clustering" -+ -+enum cluster_message { -+ MSG_ACK = 1, -+ MSG_NACK = 2, -+ MSG_PING = 4, -+ MSG_ABORT = 8, -+ MSG_BYE = 16, -+ MSG_HIBERNATE = 32, -+ MSG_IMAGE = 64, -+ MSG_IO = 128, -+ MSG_RUNNING = 256 -+}; -+ -+static char *str_message(int message) -+{ -+ switch (message) { -+ case 4: -+ return "Ping"; -+ case 8: -+ return "Abort"; -+ case 9: -+ return "Abort acked"; -+ case 10: -+ return "Abort nacked"; -+ case 16: -+ return "Bye"; -+ case 17: -+ return "Bye acked"; -+ case 18: -+ return "Bye nacked"; -+ case 32: -+ return "Hibernate request"; -+ case 33: -+ return "Hibernate ack"; -+ case 34: -+ return "Hibernate nack"; -+ case 64: -+ return "Image exists?"; -+ case 65: -+ return "Image does exist"; -+ case 66: -+ return "No image here"; -+ case 128: -+ return "I/O"; -+ case 129: -+ return "I/O okay"; -+ case 130: -+ return "I/O failed"; -+ case 256: -+ return "Running"; -+ default: -+ printk(KERN_ERR "Unrecognised message %d.\n", message); -+ return "Unrecognised message (see dmesg)"; -+ } -+} -+ -+#define MSG_ACK_MASK (MSG_ACK | MSG_NACK) -+#define MSG_STATE_MASK (~MSG_ACK_MASK) -+ -+struct node_info { -+ struct list_head member_list; -+ wait_queue_head_t member_events; -+ spinlock_t member_list_lock; -+ spinlock_t receive_lock; -+ int peer_count, ignored_peer_count; -+ struct toi_sysfs_data sysfs_data; -+ enum cluster_message current_message; -+}; -+ -+struct node_info node_array[MAX_LOCAL_NODES]; -+ -+struct cluster_member { -+ __be32 addr; -+ enum cluster_message message; -+ struct list_head list; -+ int ignore; -+}; -+ -+#define toi_cluster_port_send 3501 -+#define toi_cluster_port_recv 3502 -+ -+static struct net_device *net_dev; -+static struct toi_module_ops toi_cluster_ops; -+ -+static int toi_recv(struct sk_buff *skb, struct net_device *dev, -+ struct packet_type *pt, struct net_device *orig_dev); -+ -+static struct packet_type toi_cluster_packet_type = { -+ .type = __constant_htons(ETH_P_IP), -+ 
.func = toi_recv, -+}; -+ -+struct toi_pkt { /* BOOTP packet format */ -+ struct iphdr iph; /* IP header */ -+ struct udphdr udph; /* UDP header */ -+ u8 htype; /* HW address type */ -+ u8 hlen; /* HW address length */ -+ __be32 xid; /* Transaction ID */ -+ __be16 secs; /* Seconds since we started */ -+ __be16 flags; /* Just what it says */ -+ u8 hw_addr[16]; /* Sender's HW address */ -+ u16 message; /* Message */ -+ unsigned long sid; /* Source ID for loopback testing */ -+}; -+ -+static char toi_cluster_iface[IFNAMSIZ] = CONFIG_TOI_DEFAULT_CLUSTER_INTERFACE; -+ -+static int added_pack; -+ -+static int others_have_image; -+ -+/* Key used to allow multiple clusters on the same lan */ -+static char toi_cluster_key[32] = CONFIG_TOI_DEFAULT_CLUSTER_KEY; -+static char pre_hibernate_script[255] = -+ CONFIG_TOI_DEFAULT_CLUSTER_PRE_HIBERNATE; -+static char post_hibernate_script[255] = -+ CONFIG_TOI_DEFAULT_CLUSTER_POST_HIBERNATE; -+ -+/* List of cluster members */ -+static unsigned long continue_delay = 5 * HZ; -+static unsigned long cluster_message_timeout = 3 * HZ; -+ -+/* === Membership list === */ -+ -+static void print_member_info(int index) -+{ -+ struct cluster_member *this; -+ -+ printk(KERN_INFO "==> Dumping node %d.\n", index); -+ -+ list_for_each_entry(this, &node_array[index].member_list, list) -+ printk(KERN_INFO "%d.%d.%d.%d last message %s. %s\n", -+ NIPQUAD(this->addr), -+ str_message(this->message), -+ this->ignore ? 
"(Ignored)" : ""); -+ printk(KERN_INFO "== Done ==\n"); -+} -+ -+static struct cluster_member *__find_member(int index, __be32 addr) -+{ -+ struct cluster_member *this; -+ -+ list_for_each_entry(this, &node_array[index].member_list, list) { -+ if (this->addr != addr) -+ continue; -+ -+ return this; -+ } -+ -+ return NULL; -+} -+ -+static void set_ignore(int index, __be32 addr, struct cluster_member *this) -+{ -+ if (this->ignore) { -+ PRINTK("Node %d already ignoring %d.%d.%d.%d.\n", -+ index, NIPQUAD(addr)); -+ return; -+ } -+ -+ PRINTK("Node %d sees node %d.%d.%d.%d now being ignored.\n", -+ index, NIPQUAD(addr)); -+ this->ignore = 1; -+ node_array[index].ignored_peer_count++; -+} -+ -+static int __add_update_member(int index, __be32 addr, int message) -+{ -+ struct cluster_member *this; -+ -+ this = __find_member(index, addr); -+ if (this) { -+ if (this->message != message) { -+ this->message = message; -+ if ((message & MSG_NACK) && -+ (message & (MSG_HIBERNATE | MSG_IMAGE | MSG_IO))) -+ set_ignore(index, addr, this); -+ PRINTK("Node %d sees node %d.%d.%d.%d now sending " -+ "%s.\n", index, NIPQUAD(addr), -+ str_message(message)); -+ wake_up(&node_array[index].member_events); -+ } -+ return 0; -+ } -+ -+ this = (struct cluster_member *) toi_kzalloc(36, -+ sizeof(struct cluster_member), GFP_KERNEL); -+ -+ if (!this) -+ return -1; -+ -+ this->addr = addr; -+ this->message = message; -+ this->ignore = 0; -+ INIT_LIST_HEAD(&this->list); -+ -+ node_array[index].peer_count++; -+ -+ PRINTK("Node %d sees node %d.%d.%d.%d sending %s.\n", index, -+ NIPQUAD(addr), str_message(message)); -+ -+ if ((message & MSG_NACK) && -+ (message & (MSG_HIBERNATE | MSG_IMAGE | MSG_IO))) -+ set_ignore(index, addr, this); -+ list_add_tail(&this->list, &node_array[index].member_list); -+ return 1; -+} -+ -+static int add_update_member(int index, __be32 addr, int message) -+{ -+ int result; -+ unsigned long flags; -+ spin_lock_irqsave(&node_array[index].member_list_lock, flags); -+ result = 
__add_update_member(index, addr, message); -+ spin_unlock_irqrestore(&node_array[index].member_list_lock, flags); -+ -+ print_member_info(index); -+ -+ wake_up(&node_array[index].member_events); -+ -+ return result; -+} -+ -+static void del_member(int index, __be32 addr) -+{ -+ struct cluster_member *this; -+ unsigned long flags; -+ -+ spin_lock_irqsave(&node_array[index].member_list_lock, flags); -+ this = __find_member(index, addr); -+ -+ if (this) { -+ list_del_init(&this->list); -+ toi_kfree(36, this, sizeof(*this)); -+ node_array[index].peer_count--; -+ } -+ -+ spin_unlock_irqrestore(&node_array[index].member_list_lock, flags); -+} -+ -+/* === Message transmission === */ -+ -+static void toi_send_if(int message, unsigned long my_id); -+ -+/* -+ * Process received TOI packet. -+ */ -+static int toi_recv(struct sk_buff *skb, struct net_device *dev, -+ struct packet_type *pt, struct net_device *orig_dev) -+{ -+ struct toi_pkt *b; -+ struct iphdr *h; -+ int len, result, index; -+ unsigned long addr, message, ack; -+ -+ /* Perform verifications before taking the lock. 
*/ -+ if (skb->pkt_type == PACKET_OTHERHOST) -+ goto drop; -+ -+ if (dev != net_dev) -+ goto drop; -+ -+ skb = skb_share_check(skb, GFP_ATOMIC); -+ if (!skb) -+ return NET_RX_DROP; -+ -+ if (!pskb_may_pull(skb, -+ sizeof(struct iphdr) + -+ sizeof(struct udphdr))) -+ goto drop; -+ -+ b = (struct toi_pkt *)skb_network_header(skb); -+ h = &b->iph; -+ -+ if (h->ihl != 5 || h->version != 4 || h->protocol != IPPROTO_UDP) -+ goto drop; -+ -+ /* Fragments are not supported */ -+ if (h->frag_off & htons(IP_OFFSET | IP_MF)) { -+ if (net_ratelimit()) -+ printk(KERN_ERR "TuxOnIce: Ignoring fragmented " -+ "cluster message.\n"); -+ goto drop; -+ } -+ -+ if (skb->len < ntohs(h->tot_len)) -+ goto drop; -+ -+ if (ip_fast_csum((char *) h, h->ihl)) -+ goto drop; -+ -+ if (b->udph.source != htons(toi_cluster_port_send) || -+ b->udph.dest != htons(toi_cluster_port_recv)) -+ goto drop; -+ -+ if (ntohs(h->tot_len) < ntohs(b->udph.len) + sizeof(struct iphdr)) -+ goto drop; -+ -+ len = ntohs(b->udph.len) - sizeof(struct udphdr); -+ -+ /* Ok the front looks good, make sure we can get at the rest. */ -+ if (!pskb_may_pull(skb, skb->len)) -+ goto drop; -+ -+ b = (struct toi_pkt *)skb_network_header(skb); -+ h = &b->iph; -+ -+ addr = SADDR; -+ PRINTK(">>> Message %s received from " NIPQUAD_FMT ".\n", -+ str_message(b->message), NIPQUAD(addr)); -+ -+ message = b->message & MSG_STATE_MASK; -+ ack = b->message & MSG_ACK_MASK; -+ -+ for (index = 0; index < num_local_nodes; index++) { -+ int new_message = node_array[index].current_message, -+ old_message = new_message; -+ -+ if (index == SADDR || !old_message) { -+ PRINTK("Ignoring node %d (offline or self).\n", index); -+ continue; -+ } -+ -+ /* One message at a time, please. 
*/ -+ spin_lock(&node_array[index].receive_lock); -+ -+ result = add_update_member(index, SADDR, b->message); -+ if (result == -1) { -+ printk(KERN_INFO "Failed to add new cluster member " -+ NIPQUAD_FMT ".\n", -+ NIPQUAD(addr)); -+ goto drop_unlock; -+ } -+ -+ switch (b->message & MSG_STATE_MASK) { -+ case MSG_PING: -+ break; -+ case MSG_ABORT: -+ break; -+ case MSG_BYE: -+ break; -+ case MSG_HIBERNATE: -+ /* Can I hibernate? */ -+ new_message = MSG_HIBERNATE | -+ ((index & 1) ? MSG_NACK : MSG_ACK); -+ break; -+ case MSG_IMAGE: -+ /* Can I resume? */ -+ new_message = MSG_IMAGE | -+ ((index & 1) ? MSG_NACK : MSG_ACK); -+ if (new_message != old_message) -+ printk(KERN_ERR "Setting whether I can resume " -+ "to %d.\n", new_message); -+ break; -+ case MSG_IO: -+ new_message = MSG_IO | MSG_ACK; -+ break; -+ case MSG_RUNNING: -+ break; -+ default: -+ if (net_ratelimit()) -+ printk(KERN_ERR "Unrecognised TuxOnIce cluster" -+ " message %d from " NIPQUAD_FMT ".\n", -+ b->message, NIPQUAD(addr)); -+ }; -+ -+ if (old_message != new_message) { -+ node_array[index].current_message = new_message; -+ printk(KERN_INFO ">>> Sending new message for node " -+ "%d.\n", index); -+ toi_send_if(new_message, index); -+ } else if (!ack) { -+ printk(KERN_INFO ">>> Resending message for node %d.\n", -+ index); -+ toi_send_if(new_message, index); -+ } -+drop_unlock: -+ spin_unlock(&node_array[index].receive_lock); -+ }; -+ -+drop: -+ /* Throw the packet out. */ -+ kfree_skb(skb); -+ -+ return 0; -+} -+ -+/* -+ * Send cluster message to single interface. 
-+ */ -+static void toi_send_if(int message, unsigned long my_id) -+{ -+ struct sk_buff *skb; -+ struct toi_pkt *b; -+ int hh_len = LL_RESERVED_SPACE(net_dev); -+ struct iphdr *h; -+ -+ /* Allocate packet */ -+ skb = alloc_skb(sizeof(struct toi_pkt) + hh_len + 15, GFP_KERNEL); -+ if (!skb) -+ return; -+ skb_reserve(skb, hh_len); -+ b = (struct toi_pkt *) skb_put(skb, sizeof(struct toi_pkt)); -+ memset(b, 0, sizeof(struct toi_pkt)); -+ -+ /* Construct IP header */ -+ skb_reset_network_header(skb); -+ h = ip_hdr(skb); -+ h->version = 4; -+ h->ihl = 5; -+ h->tot_len = htons(sizeof(struct toi_pkt)); -+ h->frag_off = htons(IP_DF); -+ h->ttl = 64; -+ h->protocol = IPPROTO_UDP; -+ h->daddr = htonl(INADDR_BROADCAST); -+ h->check = ip_fast_csum((unsigned char *) h, h->ihl); -+ -+ /* Construct UDP header */ -+ b->udph.source = htons(toi_cluster_port_send); -+ b->udph.dest = htons(toi_cluster_port_recv); -+ b->udph.len = htons(sizeof(struct toi_pkt) - sizeof(struct iphdr)); -+ /* UDP checksum not calculated -- explicitly allowed in BOOTP RFC */ -+ -+ /* Construct message */ -+ b->message = message; -+ b->sid = my_id; -+ b->htype = net_dev->type; /* can cause undefined behavior */ -+ b->hlen = net_dev->addr_len; -+ memcpy(b->hw_addr, net_dev->dev_addr, net_dev->addr_len); -+ b->secs = htons(3); /* 3 seconds */ -+ -+ /* Chain packet down the line... 
*/ -+ skb->dev = net_dev; -+ skb->protocol = htons(ETH_P_IP); -+ if ((dev_hard_header(skb, net_dev, ntohs(skb->protocol), -+ net_dev->broadcast, net_dev->dev_addr, skb->len) < 0) || -+ dev_queue_xmit(skb) < 0) -+ printk(KERN_INFO "E"); -+} -+ -+/* ========================================= */ -+ -+/* kTOICluster */ -+ -+static atomic_t num_cluster_threads; -+static DECLARE_WAIT_QUEUE_HEAD(clusterd_events); -+ -+static int kTOICluster(void *data) -+{ -+ unsigned long my_id; -+ -+ my_id = atomic_add_return(1, &num_cluster_threads) - 1; -+ node_array[my_id].current_message = (unsigned long) data; -+ -+ PRINTK("kTOICluster daemon %lu starting.\n", my_id); -+ -+ current->flags |= PF_NOFREEZE; -+ -+ while (node_array[my_id].current_message) { -+ toi_send_if(node_array[my_id].current_message, my_id); -+ sleep_on_timeout(&clusterd_events, -+ cluster_message_timeout); -+ PRINTK("Link state %lu is %d.\n", my_id, -+ node_array[my_id].current_message); -+ } -+ -+ toi_send_if(MSG_BYE, my_id); -+ atomic_dec(&num_cluster_threads); -+ wake_up(&clusterd_events); -+ -+ PRINTK("kTOICluster daemon %lu exiting.\n", my_id); -+ __set_current_state(TASK_RUNNING); -+ return 0; -+} -+ -+static void kill_clusterd(void) -+{ -+ int i; -+ -+ for (i = 0; i < num_local_nodes; i++) { -+ if (node_array[i].current_message) { -+ PRINTK("Seeking to kill clusterd %d.\n", i); -+ node_array[i].current_message = 0; -+ } -+ } -+ wait_event(clusterd_events, -+ !atomic_read(&num_cluster_threads)); -+ PRINTK("All cluster daemons have exited.\n"); -+} -+ -+static int peers_not_in_message(int index, int message, int precise) -+{ -+ struct cluster_member *this; -+ unsigned long flags; -+ int result = 0; -+ -+ spin_lock_irqsave(&node_array[index].member_list_lock, flags); -+ list_for_each_entry(this, &node_array[index].member_list, list) { -+ if (this->ignore) -+ continue; -+ -+ PRINTK("Peer %d.%d.%d.%d sending %s. 
" -+ "Seeking %s.\n", -+ NIPQUAD(this->addr), -+ str_message(this->message), str_message(message)); -+ if ((precise ? this->message : -+ this->message & MSG_STATE_MASK) != -+ message) -+ result++; -+ } -+ spin_unlock_irqrestore(&node_array[index].member_list_lock, flags); -+ PRINTK("%d peers in sought message.\n", result); -+ return result; -+} -+ -+static void reset_ignored(int index) -+{ -+ struct cluster_member *this; -+ unsigned long flags; -+ -+ spin_lock_irqsave(&node_array[index].member_list_lock, flags); -+ list_for_each_entry(this, &node_array[index].member_list, list) -+ this->ignore = 0; -+ node_array[index].ignored_peer_count = 0; -+ spin_unlock_irqrestore(&node_array[index].member_list_lock, flags); -+} -+ -+static int peers_in_message(int index, int message, int precise) -+{ -+ return node_array[index].peer_count - -+ node_array[index].ignored_peer_count - -+ peers_not_in_message(index, message, precise); -+} -+ -+static int time_to_continue(int index, unsigned long start, int message) -+{ -+ int first = peers_not_in_message(index, message, 0); -+ int second = peers_in_message(index, message, 1); -+ -+ PRINTK("First part returns %d, second returns %d.\n", first, second); -+ -+ if (!first && !second) { -+ PRINTK("All peers answered message %d.\n", -+ message); -+ return 1; -+ } -+ -+ if (time_after(jiffies, start + continue_delay)) { -+ PRINTK("Timeout reached.\n"); -+ return 1; -+ } -+ -+ PRINTK("Not time to continue yet (%lu < %lu).\n", jiffies, -+ start + continue_delay); -+ return 0; -+} -+ -+void toi_initiate_cluster_hibernate(void) -+{ -+ int result; -+ unsigned long start; -+ -+ result = do_toi_step(STEP_HIBERNATE_PREPARE_IMAGE); -+ if (result) -+ return; -+ -+ toi_send_if(MSG_HIBERNATE, 0); -+ -+ start = jiffies; -+ wait_event(node_array[0].member_events, -+ time_to_continue(0, start, MSG_HIBERNATE)); -+ -+ if (test_action_state(TOI_FREEZER_TEST)) { -+ toi_send_if(MSG_ABORT, 0); -+ -+ start = jiffies; -+ wait_event(node_array[0].member_events, 
-+ time_to_continue(0, start, MSG_RUNNING)); -+ -+ do_toi_step(STEP_QUIET_CLEANUP); -+ return; -+ } -+ -+ toi_send_if(MSG_IO, 0); -+ -+ result = do_toi_step(STEP_HIBERNATE_SAVE_IMAGE); -+ if (result) -+ return; -+ -+ /* This code runs at resume time too! */ -+ if (toi_in_hibernate) -+ result = do_toi_step(STEP_HIBERNATE_POWERDOWN); -+} -+EXPORT_SYMBOL_GPL(toi_initiate_cluster_hibernate); -+ -+/* toi_cluster_print_debug_stats -+ * -+ * Description: Print information to be recorded for debugging purposes into a -+ * buffer. -+ * Arguments: buffer: Pointer to a buffer into which the debug info will be -+ * printed. -+ * size: Size of the buffer. -+ * Returns: Number of characters written to the buffer. -+ */ -+static int toi_cluster_print_debug_stats(char *buffer, int size) -+{ -+ int len; -+ -+ if (strlen(toi_cluster_iface)) -+ len = scnprintf(buffer, size, -+ "- Cluster interface is '%s'.\n", -+ toi_cluster_iface); -+ else -+ len = scnprintf(buffer, size, -+ "- Cluster support is disabled.\n"); -+ return len; -+} -+ -+/* cluster_memory_needed -+ * -+ * Description: Tell the caller how much memory we need to operate during -+ * hibernate/resume. -+ * Returns: Unsigned long. Maximum number of bytes of memory required for -+ * operation. -+ */ -+static int toi_cluster_memory_needed(void) -+{ -+ return 0; -+} -+ -+static int toi_cluster_storage_needed(void) -+{ -+ return 1 + strlen(toi_cluster_iface); -+} -+ -+/* toi_cluster_save_config_info -+ * -+ * Description: Save informaton needed when reloading the image at resume time. -+ * Arguments: Buffer: Pointer to a buffer of size PAGE_SIZE. -+ * Returns: Number of bytes used for saving our data. -+ */ -+static int toi_cluster_save_config_info(char *buffer) -+{ -+ strcpy(buffer, toi_cluster_iface); -+ return strlen(toi_cluster_iface + 1); -+} -+ -+/* toi_cluster_load_config_info -+ * -+ * Description: Reload information needed for declustering the image at -+ * resume time. 
-+ * Arguments: Buffer: Pointer to the start of the data. -+ * Size: Number of bytes that were saved. -+ */ -+static void toi_cluster_load_config_info(char *buffer, int size) -+{ -+ strncpy(toi_cluster_iface, buffer, size); -+ return; -+} -+ -+static void cluster_startup(void) -+{ -+ int have_image = do_check_can_resume(), i; -+ unsigned long start = jiffies, initial_message; -+ struct task_struct *p; -+ -+ initial_message = MSG_IMAGE; -+ -+ have_image = 1; -+ -+ for (i = 0; i < num_local_nodes; i++) { -+ PRINTK("Starting ktoiclusterd %d.\n", i); -+ p = kthread_create(kTOICluster, (void *) initial_message, -+ "ktoiclusterd/%d", i); -+ if (IS_ERR(p)) { -+ printk(KERN_ERR "Failed to start ktoiclusterd.\n"); -+ return; -+ } -+ -+ wake_up_process(p); -+ } -+ -+ /* Wait for delay or someone else sending first message */ -+ wait_event(node_array[0].member_events, time_to_continue(0, start, -+ MSG_IMAGE)); -+ -+ others_have_image = peers_in_message(0, MSG_IMAGE | MSG_ACK, 1); -+ -+ printk(KERN_INFO "Continuing. I %shave an image. Peers with image:" -+ " %d.\n", have_image ? "" : "don't ", others_have_image); -+ -+ if (have_image) { -+ int result; -+ -+ /* Start to resume */ -+ printk(KERN_INFO " === Starting to resume === \n"); -+ node_array[0].current_message = MSG_IO; -+ toi_send_if(MSG_IO, 0); -+ -+ /* result = do_toi_step(STEP_RESUME_LOAD_PS1); */ -+ result = 0; -+ -+ if (!result) { -+ /* -+ * Atomic restore - we'll come back in the hibernation -+ * path. -+ */ -+ -+ /* result = do_toi_step(STEP_RESUME_DO_RESTORE); */ -+ result = 0; -+ -+ /* do_toi_step(STEP_QUIET_CLEANUP); */ -+ } -+ -+ node_array[0].current_message |= MSG_NACK; -+ -+ /* For debugging - disable for real life? 
*/ -+ wait_event(node_array[0].member_events, -+ time_to_continue(0, start, MSG_IO)); -+ } -+ -+ if (others_have_image) { -+ /* Wait for them to resume */ -+ printk(KERN_INFO "Waiting for other nodes to resume.\n"); -+ start = jiffies; -+ wait_event(node_array[0].member_events, -+ time_to_continue(0, start, MSG_RUNNING)); -+ if (peers_not_in_message(0, MSG_RUNNING, 0)) -+ printk(KERN_INFO "Timed out while waiting for other " -+ "nodes to resume.\n"); -+ } -+ -+ /* Find out whether an image exists here. Send ACK_IMAGE or NACK_IMAGE -+ * as appropriate. -+ * -+ * If we don't have an image: -+ * - Wait until someone else says they have one, or conditions are met -+ * for continuing to boot (n machines or t seconds). -+ * - If anyone has an image, wait for them to resume before continuing -+ * to boot. -+ * -+ * If we have an image: -+ * - Wait until conditions are met before continuing to resume (n -+ * machines or t seconds). Send RESUME_PREP and freeze processes. -+ * NACK_PREP if freezing fails (shouldn't) and follow logic for -+ * us having no image above. On success, wait for [N]ACK_PREP from -+ * other machines. Read image (including atomic restore) until done. -+ * Wait for ACK_READ from others (should never fail). Thaw processes -+ * and do post-resume. (The section after the atomic restore is done -+ * via the code for hibernating). -+ */ -+ -+ node_array[0].current_message = MSG_RUNNING; -+} -+ -+/* toi_cluster_open_iface -+ * -+ * Description: Prepare to use an interface. 
-+ */ -+ -+static int toi_cluster_open_iface(void) -+{ -+ struct net_device *dev; -+ -+ rtnl_lock(); -+ -+ for_each_netdev(&init_net, dev) { -+ if (/* dev == &init_net.loopback_dev || */ -+ strcmp(dev->name, toi_cluster_iface)) -+ continue; -+ -+ net_dev = dev; -+ break; -+ } -+ -+ rtnl_unlock(); -+ -+ if (!net_dev) { -+ printk(KERN_ERR MYNAME ": Device %s not found.\n", -+ toi_cluster_iface); -+ return -ENODEV; -+ } -+ -+ dev_add_pack(&toi_cluster_packet_type); -+ added_pack = 1; -+ -+ loopback_mode = (net_dev == init_net.loopback_dev); -+ num_local_nodes = loopback_mode ? 8 : 1; -+ -+ PRINTK("Loopback mode is %s. Number of local nodes is %d.\n", -+ loopback_mode ? "on" : "off", num_local_nodes); -+ -+ cluster_startup(); -+ return 0; -+} -+ -+/* toi_cluster_close_iface -+ * -+ * Description: Stop using an interface. -+ */ -+ -+static int toi_cluster_close_iface(void) -+{ -+ kill_clusterd(); -+ if (added_pack) { -+ dev_remove_pack(&toi_cluster_packet_type); -+ added_pack = 0; -+ } -+ return 0; -+} -+ -+static void write_side_effect(void) -+{ -+ if (toi_cluster_ops.enabled) { -+ toi_cluster_open_iface(); -+ set_toi_state(TOI_CLUSTER_MODE); -+ } else { -+ toi_cluster_close_iface(); -+ clear_toi_state(TOI_CLUSTER_MODE); -+ } -+} -+ -+static void node_write_side_effect(void) -+{ -+} -+ -+/* -+ * data for our sysfs entries. -+ */ -+static struct toi_sysfs_data sysfs_params[] = { -+ SYSFS_STRING("interface", SYSFS_RW, toi_cluster_iface, IFNAMSIZ, 0, -+ NULL), -+ SYSFS_INT("enabled", SYSFS_RW, &toi_cluster_ops.enabled, 0, 1, 0, -+ write_side_effect), -+ SYSFS_STRING("cluster_name", SYSFS_RW, toi_cluster_key, 32, 0, NULL), -+ SYSFS_STRING("pre-hibernate-script", SYSFS_RW, pre_hibernate_script, -+ 256, 0, NULL), -+ SYSFS_STRING("post-hibernate-script", SYSFS_RW, post_hibernate_script, -+ 256, 0, STRING), -+ SYSFS_UL("continue_delay", SYSFS_RW, &continue_delay, HZ / 2, 60 * HZ, -+ 0) -+}; -+ -+/* -+ * Ops structure. 
-+ */ -+ -+static struct toi_module_ops toi_cluster_ops = { -+ .type = FILTER_MODULE, -+ .name = "Cluster", -+ .directory = "cluster", -+ .module = THIS_MODULE, -+ .memory_needed = toi_cluster_memory_needed, -+ .print_debug_info = toi_cluster_print_debug_stats, -+ .save_config_info = toi_cluster_save_config_info, -+ .load_config_info = toi_cluster_load_config_info, -+ .storage_needed = toi_cluster_storage_needed, -+ -+ .sysfs_data = sysfs_params, -+ .num_sysfs_entries = sizeof(sysfs_params) / -+ sizeof(struct toi_sysfs_data), -+}; -+ -+/* ---- Registration ---- */ -+ -+#ifdef MODULE -+#define INIT static __init -+#define EXIT static __exit -+#else -+#define INIT -+#define EXIT -+#endif -+ -+INIT int toi_cluster_init(void) -+{ -+ int temp = toi_register_module(&toi_cluster_ops), i; -+ struct kobject *kobj = toi_cluster_ops.dir_kobj; -+ -+ for (i = 0; i < MAX_LOCAL_NODES; i++) { -+ node_array[i].current_message = 0; -+ INIT_LIST_HEAD(&node_array[i].member_list); -+ init_waitqueue_head(&node_array[i].member_events); -+ spin_lock_init(&node_array[i].member_list_lock); -+ spin_lock_init(&node_array[i].receive_lock); -+ -+ /* Set up sysfs entry */ -+ node_array[i].sysfs_data.attr.name = toi_kzalloc(8, -+ sizeof(node_array[i].sysfs_data.attr.name), -+ GFP_KERNEL); -+ sprintf((char *) node_array[i].sysfs_data.attr.name, "node_%d", -+ i); -+ node_array[i].sysfs_data.attr.mode = SYSFS_RW; -+ node_array[i].sysfs_data.type = TOI_SYSFS_DATA_INTEGER; -+ node_array[i].sysfs_data.flags = 0; -+ node_array[i].sysfs_data.data.integer.variable = -+ (int *) &node_array[i].current_message; -+ node_array[i].sysfs_data.data.integer.minimum = 0; -+ node_array[i].sysfs_data.data.integer.maximum = INT_MAX; -+ node_array[i].sysfs_data.write_side_effect = -+ node_write_side_effect; -+ toi_register_sysfs_file(kobj, &node_array[i].sysfs_data); -+ } -+ -+ toi_cluster_ops.enabled = (strlen(toi_cluster_iface) > 0); -+ -+ if (toi_cluster_ops.enabled) -+ toi_cluster_open_iface(); -+ -+ return temp; 
-+} -+ -+EXIT void toi_cluster_exit(void) -+{ -+ int i; -+ toi_cluster_close_iface(); -+ -+ for (i = 0; i < MAX_LOCAL_NODES; i++) -+ toi_unregister_sysfs_file(toi_cluster_ops.dir_kobj, -+ &node_array[i].sysfs_data); -+ toi_unregister_module(&toi_cluster_ops); -+} -+ -+static int __init toi_cluster_iface_setup(char *iface) -+{ -+ toi_cluster_ops.enabled = (*iface && -+ strcmp(iface, "off")); -+ -+ if (toi_cluster_ops.enabled) -+ strncpy(toi_cluster_iface, iface, strlen(iface)); -+} -+ -+__setup("toi_cluster=", toi_cluster_iface_setup); -+ -+#ifdef MODULE -+MODULE_LICENSE("GPL"); -+module_init(toi_cluster_init); -+module_exit(toi_cluster_exit); -+MODULE_AUTHOR("Nigel Cunningham"); -+MODULE_DESCRIPTION("Cluster Support for TuxOnIce"); -+#endif -diff --git a/kernel/power/tuxonice_cluster.h b/kernel/power/tuxonice_cluster.h -new file mode 100644 -index 0000000..051feb3 ---- /dev/null -+++ b/kernel/power/tuxonice_cluster.h -@@ -0,0 +1,18 @@ -+/* -+ * kernel/power/tuxonice_cluster.h -+ * -+ * Copyright (C) 2006-2010 Nigel Cunningham (nigel at tuxonice net) -+ * -+ * This file is released under the GPLv2. -+ */ -+ -+#ifdef CONFIG_TOI_CLUSTER -+extern int toi_cluster_init(void); -+extern void toi_cluster_exit(void); -+extern void toi_initiate_cluster_hibernate(void); -+#else -+static inline int toi_cluster_init(void) { return 0; } -+static inline void toi_cluster_exit(void) { } -+static inline void toi_initiate_cluster_hibernate(void) { } -+#endif -+ -diff --git a/kernel/power/tuxonice_compress.c b/kernel/power/tuxonice_compress.c -new file mode 100644 -index 0000000..6bbc446 ---- /dev/null -+++ b/kernel/power/tuxonice_compress.c -@@ -0,0 +1,497 @@ -+/* -+ * kernel/power/compression.c -+ * -+ * Copyright (C) 2003-2010 Nigel Cunningham (nigel at tuxonice net) -+ * -+ * This file is released under the GPLv2. -+ * -+ * This file contains data compression routines for TuxOnIce, -+ * using cryptoapi. 
-+ */ -+ -+#include -+#include -+#include -+#include -+ -+#include "tuxonice_builtin.h" -+#include "tuxonice.h" -+#include "tuxonice_modules.h" -+#include "tuxonice_sysfs.h" -+#include "tuxonice_io.h" -+#include "tuxonice_ui.h" -+#include "tuxonice_alloc.h" -+ -+static int toi_expected_compression; -+ -+static struct toi_module_ops toi_compression_ops; -+static struct toi_module_ops *next_driver; -+ -+static char toi_compressor_name[32] = "lzo"; -+ -+static DEFINE_MUTEX(stats_lock); -+ -+struct cpu_context { -+ u8 *page_buffer; -+ struct crypto_comp *transform; -+ unsigned int len; -+ char *buffer_start; -+ char *output_buffer; -+ char *check_buffer; -+}; -+ -+static DEFINE_PER_CPU(struct cpu_context, contexts); -+static int toi_check_compression; -+ -+/* -+ * toi_crypto_prepare -+ * -+ * Prepare to do some work by allocating buffers and transforms. -+ */ -+static int toi_compress_crypto_prepare(void) -+{ -+ int cpu; -+ -+ if (!*toi_compressor_name) { -+ printk(KERN_INFO "TuxOnIce: Compression enabled but no " -+ "compressor name set.\n"); -+ return 1; -+ } -+ -+ for_each_online_cpu(cpu) { -+ struct cpu_context *this = &per_cpu(contexts, cpu); -+ this->transform = crypto_alloc_comp(toi_compressor_name, 0, 0); -+ if (IS_ERR(this->transform)) { -+ printk(KERN_INFO "TuxOnIce: Failed to initialise the " -+ "%s compression transform.\n", -+ toi_compressor_name); -+ this->transform = NULL; -+ return 1; -+ } -+ -+ this->page_buffer = -+ (char *) toi_get_zeroed_page(16, TOI_ATOMIC_GFP); -+ -+ if (!this->page_buffer) { -+ printk(KERN_ERR -+ "Failed to allocate a page buffer for TuxOnIce " -+ "compression driver.\n"); -+ return -ENOMEM; -+ } -+ -+ this->output_buffer = -+ (char *) vmalloc_32(2 * PAGE_SIZE); -+ -+ if (!this->output_buffer) { -+ printk(KERN_ERR -+ "Failed to allocate a output buffer for TuxOnIce " -+ "compression driver.\n"); -+ return -ENOMEM; -+ } -+ -+ this->check_buffer = -+ (char *) toi_get_zeroed_page(16, TOI_ATOMIC_GFP); -+ -+ if (!this->check_buffer) { 
-+ printk(KERN_ERR -+ "Failed to allocate a check buffer for TuxOnIce " -+ "compression driver.\n"); -+ return -ENOMEM; -+ } -+ -+ } -+ -+ return 0; -+} -+ -+static int toi_compress_rw_cleanup(int writing) -+{ -+ int cpu; -+ -+ for_each_online_cpu(cpu) { -+ struct cpu_context *this = &per_cpu(contexts, cpu); -+ if (this->transform) { -+ crypto_free_comp(this->transform); -+ this->transform = NULL; -+ } -+ -+ if (this->page_buffer) -+ toi_free_page(16, (unsigned long) this->page_buffer); -+ -+ this->page_buffer = NULL; -+ -+ if (this->output_buffer) -+ vfree(this->output_buffer); -+ -+ this->output_buffer = NULL; -+ -+ if (this->check_buffer) -+ toi_free_page(16, (unsigned long) this->check_buffer); -+ -+ this->check_buffer = NULL; -+ } -+ -+ return 0; -+} -+ -+/* -+ * toi_compress_init -+ */ -+ -+static int toi_compress_init(int toi_or_resume) -+{ -+ if (!toi_or_resume) -+ return 0; -+ -+ toi_compress_bytes_in = 0; -+ toi_compress_bytes_out = 0; -+ -+ next_driver = toi_get_next_filter(&toi_compression_ops); -+ -+ return next_driver ? 
0 : -ECHILD; -+} -+ -+/* -+ * toi_compress_rw_init() -+ */ -+ -+static int toi_compress_rw_init(int rw, int stream_number) -+{ -+ if (toi_compress_crypto_prepare()) { -+ printk(KERN_ERR "Failed to initialise compression " -+ "algorithm.\n"); -+ if (rw == READ) { -+ printk(KERN_INFO "Unable to read the image.\n"); -+ return -ENODEV; -+ } else { -+ printk(KERN_INFO "Continuing without " -+ "compressing the image.\n"); -+ toi_compression_ops.enabled = 0; -+ } -+ } -+ -+ return 0; -+} -+ -+static int check_compression(struct cpu_context *ctx, struct page *buffer_page, -+ int buf_size) -+{ -+ char *original = kmap(buffer_page); -+ int output_size = PAGE_SIZE, okay, ret; -+ -+ ret = crypto_comp_decompress(ctx->transform, ctx->output_buffer, -+ ctx->len, ctx->check_buffer, &output_size); -+ okay = (!ret && output_size == PAGE_SIZE && -+ !memcmp(ctx->check_buffer, original, PAGE_SIZE)); -+ -+ if (!okay) { -+ printk("Compression test failed.\n"); -+ print_hex_dump(KERN_ERR, "Original page: ", DUMP_PREFIX_NONE, -+ 16, 1, original, PAGE_SIZE, 0); -+ printk(KERN_ERR "\nOutput %d bytes. Result %d.", ctx->len, ret); -+ print_hex_dump(KERN_ERR, "Compressed to: ", DUMP_PREFIX_NONE, -+ 16, 1, ctx->output_buffer, ctx->len, 0); -+ printk(KERN_ERR "\nRestored to %d bytes.\n", output_size); -+ print_hex_dump(KERN_ERR, "Decompressed : ", DUMP_PREFIX_NONE, -+ 16, 1, ctx->check_buffer, output_size, 0); -+ } -+ kunmap(buffer_page); -+ -+ return okay; -+} -+ -+/* -+ * toi_compress_write_page() -+ * -+ * Compress a page of data, buffering output and passing on filled -+ * pages to the next module in the pipeline. -+ * -+ * Buffer_page: Pointer to a buffer of size PAGE_SIZE, containing -+ * data to be compressed. -+ * -+ * Returns: 0 on success. Otherwise the error is that returned by later -+ * modules, -ECHILD if we have a broken pipeline or -EIO if -+ * zlib errs. 
-+ */ -+static int toi_compress_write_page(unsigned long index, -+ struct page *buffer_page, unsigned int buf_size) -+{ -+ int ret, cpu = smp_processor_id(); -+ struct cpu_context *ctx = &per_cpu(contexts, cpu); -+ -+ if (!ctx->transform) -+ return next_driver->write_page(index, buffer_page, buf_size); -+ -+ ctx->buffer_start = kmap(buffer_page); -+ -+ ctx->len = PAGE_SIZE; -+ -+ ret = crypto_comp_compress(ctx->transform, -+ ctx->buffer_start, buf_size, -+ ctx->output_buffer, &ctx->len); -+ -+ kunmap(buffer_page); -+ -+ mutex_lock(&stats_lock); -+ toi_compress_bytes_in += buf_size; -+ toi_compress_bytes_out += ctx->len; -+ mutex_unlock(&stats_lock); -+ -+ if (!ret && ctx->len < buf_size) { /* some compression */ -+ if (unlikely(toi_check_compression)) { -+ ret = check_compression(ctx, buffer_page, buf_size); -+ if (!ret) -+ return next_driver->write_page(index, -+ buffer_page, buf_size); -+ } -+ -+ memcpy(ctx->page_buffer, ctx->output_buffer, ctx->len); -+ return next_driver->write_page(index, -+ virt_to_page(ctx->page_buffer), -+ ctx->len); -+ } else -+ return next_driver->write_page(index, buffer_page, buf_size); -+} -+ -+/* -+ * toi_compress_read_page() -+ * @buffer_page: struct page *. Pointer to a buffer of size PAGE_SIZE. -+ * -+ * Retrieve data from later modules and decompress it until the input buffer -+ * is filled. -+ * Zero if successful. Error condition from me or from downstream on failure. -+ */ -+static int toi_compress_read_page(unsigned long *index, -+ struct page *buffer_page, unsigned int *buf_size) -+{ -+ int ret, cpu = smp_processor_id(); -+ unsigned int len; -+ unsigned int outlen = PAGE_SIZE; -+ char *buffer_start; -+ struct cpu_context *ctx = &per_cpu(contexts, cpu); -+ -+ if (!ctx->transform) -+ return next_driver->read_page(index, buffer_page, buf_size); -+ -+ /* -+ * All our reads must be synchronous - we can't decompress -+ * data that hasn't been read yet. 
-+ */ -+ -+ ret = next_driver->read_page(index, buffer_page, &len); -+ -+ /* Error or uncompressed data */ -+ if (ret || len == PAGE_SIZE) -+ return ret; -+ -+ buffer_start = kmap(buffer_page); -+ memcpy(ctx->page_buffer, buffer_start, len); -+ ret = crypto_comp_decompress( -+ ctx->transform, -+ ctx->page_buffer, -+ len, buffer_start, &outlen); -+ if (ret) -+ abort_hibernate(TOI_FAILED_IO, -+ "Compress_read returned %d.\n", ret); -+ else if (outlen != PAGE_SIZE) { -+ abort_hibernate(TOI_FAILED_IO, -+ "Decompression yielded %d bytes instead of %ld.\n", -+ outlen, PAGE_SIZE); -+ printk(KERN_ERR "Decompression yielded %d bytes instead of " -+ "%ld.\n", outlen, PAGE_SIZE); -+ ret = -EIO; -+ *buf_size = outlen; -+ } -+ kunmap(buffer_page); -+ return ret; -+} -+ -+/* -+ * toi_compress_print_debug_stats -+ * @buffer: Pointer to a buffer into which the debug info will be printed. -+ * @size: Size of the buffer. -+ * -+ * Print information to be recorded for debugging purposes into a buffer. -+ * Returns: Number of characters written to the buffer. -+ */ -+ -+static int toi_compress_print_debug_stats(char *buffer, int size) -+{ -+ unsigned long pages_in = toi_compress_bytes_in >> PAGE_SHIFT, -+ pages_out = toi_compress_bytes_out >> PAGE_SHIFT; -+ int len; -+ -+ /* Output the compression ratio achieved. */ -+ if (*toi_compressor_name) -+ len = scnprintf(buffer, size, "- Compressor is '%s'.\n", -+ toi_compressor_name); -+ else -+ len = scnprintf(buffer, size, "- Compressor is not set.\n"); -+ -+ if (pages_in) -+ len += scnprintf(buffer+len, size - len, " Compressed " -+ "%lu bytes into %lu (%ld percent compression).\n", -+ toi_compress_bytes_in, -+ toi_compress_bytes_out, -+ (pages_in - pages_out) * 100 / pages_in); -+ return len; -+} -+ -+/* -+ * toi_compress_compression_memory_needed -+ * -+ * Tell the caller how much memory we need to operate during hibernate/resume. -+ * Returns: Unsigned long. Maximum number of bytes of memory required for -+ * operation. 
-+ */ -+static int toi_compress_memory_needed(void) -+{ -+ return 2 * PAGE_SIZE; -+} -+ -+static int toi_compress_storage_needed(void) -+{ -+ return 4 * sizeof(unsigned long) + strlen(toi_compressor_name) + 1; -+} -+ -+/* -+ * toi_compress_save_config_info -+ * @buffer: Pointer to a buffer of size PAGE_SIZE. -+ * -+ * Save informaton needed when reloading the image at resume time. -+ * Returns: Number of bytes used for saving our data. -+ */ -+static int toi_compress_save_config_info(char *buffer) -+{ -+ int namelen = strlen(toi_compressor_name) + 1; -+ int total_len; -+ -+ *((unsigned long *) buffer) = toi_compress_bytes_in; -+ *((unsigned long *) (buffer + 1 * sizeof(unsigned long))) = -+ toi_compress_bytes_out; -+ *((unsigned long *) (buffer + 2 * sizeof(unsigned long))) = -+ toi_expected_compression; -+ *((unsigned long *) (buffer + 3 * sizeof(unsigned long))) = namelen; -+ strncpy(buffer + 4 * sizeof(unsigned long), toi_compressor_name, -+ namelen); -+ total_len = 4 * sizeof(unsigned long) + namelen; -+ return total_len; -+} -+ -+/* toi_compress_load_config_info -+ * @buffer: Pointer to the start of the data. -+ * @size: Number of bytes that were saved. -+ * -+ * Description: Reload information needed for decompressing the image at -+ * resume time. 
-+ */ -+static void toi_compress_load_config_info(char *buffer, int size) -+{ -+ int namelen; -+ -+ toi_compress_bytes_in = *((unsigned long *) buffer); -+ toi_compress_bytes_out = *((unsigned long *) (buffer + 1 * -+ sizeof(unsigned long))); -+ toi_expected_compression = *((unsigned long *) (buffer + 2 * -+ sizeof(unsigned long))); -+ namelen = *((unsigned long *) (buffer + 3 * sizeof(unsigned long))); -+ if (strncmp(toi_compressor_name, buffer + 4 * sizeof(unsigned long), -+ namelen)) -+ strncpy(toi_compressor_name, buffer + 4 * sizeof(unsigned long), -+ namelen); -+ return; -+} -+ -+static void toi_compress_pre_atomic_restore(struct toi_boot_kernel_data *bkd) -+{ -+ bkd->compress_bytes_in = toi_compress_bytes_in; -+ bkd->compress_bytes_out = toi_compress_bytes_out; -+} -+ -+static void toi_compress_post_atomic_restore(struct toi_boot_kernel_data *bkd) -+{ -+ toi_compress_bytes_in = bkd->compress_bytes_in; -+ toi_compress_bytes_out = bkd->compress_bytes_out; -+} -+ -+/* -+ * toi_expected_compression_ratio -+ * -+ * Description: Returns the expected ratio between data passed into this module -+ * and the amount of data output when writing. -+ * Returns: 100 if the module is disabled. Otherwise the value set by the -+ * user via our sysfs entry. -+ */ -+ -+static int toi_compress_expected_ratio(void) -+{ -+ if (!toi_compression_ops.enabled) -+ return 100; -+ else -+ return 100 - toi_expected_compression; -+} -+ -+/* -+ * data for our sysfs entries. -+ */ -+static struct toi_sysfs_data sysfs_params[] = { -+ SYSFS_INT("expected_compression", SYSFS_RW, &toi_expected_compression, -+ 0, 99, 0, NULL), -+ SYSFS_INT("enabled", SYSFS_RW, &toi_compression_ops.enabled, 0, 1, 0, -+ NULL), -+ SYSFS_INT("check", SYSFS_RW, &toi_check_compression, 0, 1, 0, -+ NULL), -+ SYSFS_STRING("algorithm", SYSFS_RW, toi_compressor_name, 31, 0, NULL), -+}; -+ -+/* -+ * Ops structure. 
-+ */ -+static struct toi_module_ops toi_compression_ops = { -+ .type = FILTER_MODULE, -+ .name = "compression", -+ .directory = "compression", -+ .module = THIS_MODULE, -+ .initialise = toi_compress_init, -+ .memory_needed = toi_compress_memory_needed, -+ .print_debug_info = toi_compress_print_debug_stats, -+ .save_config_info = toi_compress_save_config_info, -+ .load_config_info = toi_compress_load_config_info, -+ .storage_needed = toi_compress_storage_needed, -+ .expected_compression = toi_compress_expected_ratio, -+ -+ .pre_atomic_restore = toi_compress_pre_atomic_restore, -+ .post_atomic_restore = toi_compress_post_atomic_restore, -+ -+ .rw_init = toi_compress_rw_init, -+ .rw_cleanup = toi_compress_rw_cleanup, -+ -+ .write_page = toi_compress_write_page, -+ .read_page = toi_compress_read_page, -+ -+ .sysfs_data = sysfs_params, -+ .num_sysfs_entries = sizeof(sysfs_params) / -+ sizeof(struct toi_sysfs_data), -+}; -+ -+/* ---- Registration ---- */ -+ -+static __init int toi_compress_load(void) -+{ -+ return toi_register_module(&toi_compression_ops); -+} -+ -+#ifdef MODULE -+static __exit void toi_compress_unload(void) -+{ -+ toi_unregister_module(&toi_compression_ops); -+} -+ -+module_init(toi_compress_load); -+module_exit(toi_compress_unload); -+MODULE_LICENSE("GPL"); -+MODULE_AUTHOR("Nigel Cunningham"); -+MODULE_DESCRIPTION("Compression Support for TuxOnIce"); -+#else -+late_initcall(toi_compress_load); -+#endif -diff --git a/kernel/power/tuxonice_extent.c b/kernel/power/tuxonice_extent.c -new file mode 100644 -index 0000000..e84572c ---- /dev/null -+++ b/kernel/power/tuxonice_extent.c -@@ -0,0 +1,123 @@ -+/* -+ * kernel/power/tuxonice_extent.c -+ * -+ * Copyright (C) 2003-2010 Nigel Cunningham (nigel at tuxonice net) -+ * -+ * Distributed under GPLv2. -+ * -+ * These functions encapsulate the manipulation of storage metadata. 
-+ */ -+ -+#include -+#include "tuxonice_modules.h" -+#include "tuxonice_extent.h" -+#include "tuxonice_alloc.h" -+#include "tuxonice_ui.h" -+#include "tuxonice.h" -+ -+/** -+ * toi_get_extent - return a free extent -+ * -+ * May fail, returning NULL instead. -+ **/ -+static struct hibernate_extent *toi_get_extent(void) -+{ -+ return (struct hibernate_extent *) toi_kzalloc(2, -+ sizeof(struct hibernate_extent), TOI_ATOMIC_GFP); -+} -+ -+/** -+ * toi_put_extent_chain - free a whole chain of extents -+ * @chain: Chain to free. -+ **/ -+void toi_put_extent_chain(struct hibernate_extent_chain *chain) -+{ -+ struct hibernate_extent *this; -+ -+ this = chain->first; -+ -+ while (this) { -+ struct hibernate_extent *next = this->next; -+ toi_kfree(2, this, sizeof(*this)); -+ chain->num_extents--; -+ this = next; -+ } -+ -+ chain->first = NULL; -+ chain->last_touched = NULL; -+ chain->current_extent = NULL; -+ chain->size = 0; -+} -+EXPORT_SYMBOL_GPL(toi_put_extent_chain); -+ -+/** -+ * toi_add_to_extent_chain - add an extent to an existing chain -+ * @chain: Chain to which the extend should be added -+ * @start: Start of the extent (first physical block) -+ * @end: End of the extent (last physical block) -+ * -+ * The chain information is updated if the insertion is successful. 
-+ **/ -+int toi_add_to_extent_chain(struct hibernate_extent_chain *chain, -+ unsigned long start, unsigned long end) -+{ -+ struct hibernate_extent *new_ext = NULL, *cur_ext = NULL; -+ -+ toi_message(TOI_IO, TOI_VERBOSE, 0, -+ "Adding extent %lu-%lu to chain %p.\n", start, end, chain); -+ -+ /* Find the right place in the chain */ -+ if (chain->last_touched && chain->last_touched->start < start) -+ cur_ext = chain->last_touched; -+ else if (chain->first && chain->first->start < start) -+ cur_ext = chain->first; -+ -+ if (cur_ext) { -+ while (cur_ext->next && cur_ext->next->start < start) -+ cur_ext = cur_ext->next; -+ -+ if (cur_ext->end == (start - 1)) { -+ struct hibernate_extent *next_ext = cur_ext->next; -+ cur_ext->end = end; -+ -+ /* Merge with the following one? */ -+ if (next_ext && cur_ext->end + 1 == next_ext->start) { -+ cur_ext->end = next_ext->end; -+ cur_ext->next = next_ext->next; -+ toi_kfree(2, next_ext, sizeof(*next_ext)); -+ chain->num_extents--; -+ } -+ -+ chain->last_touched = cur_ext; -+ chain->size += (end - start + 1); -+ -+ return 0; -+ } -+ } -+ -+ new_ext = toi_get_extent(); -+ if (!new_ext) { -+ printk(KERN_INFO "Error unable to append a new extent to the " -+ "chain.\n"); -+ return -ENOMEM; -+ } -+ -+ chain->num_extents++; -+ chain->size += (end - start + 1); -+ new_ext->start = start; -+ new_ext->end = end; -+ -+ chain->last_touched = new_ext; -+ -+ if (cur_ext) { -+ new_ext->next = cur_ext->next; -+ cur_ext->next = new_ext; -+ } else { -+ if (chain->first) -+ new_ext->next = chain->first; -+ chain->first = new_ext; -+ } -+ -+ return 0; -+} -+EXPORT_SYMBOL_GPL(toi_add_to_extent_chain); -diff --git a/kernel/power/tuxonice_extent.h b/kernel/power/tuxonice_extent.h -new file mode 100644 -index 0000000..157446c ---- /dev/null -+++ b/kernel/power/tuxonice_extent.h -@@ -0,0 +1,44 @@ -+/* -+ * kernel/power/tuxonice_extent.h -+ * -+ * Copyright (C) 2003-2010 Nigel Cunningham (nigel at tuxonice net) -+ * -+ * This file is released under the 
GPLv2. -+ * -+ * It contains declarations related to extents. Extents are -+ * TuxOnIce's method of storing some of the metadata for the image. -+ * See tuxonice_extent.c for more info. -+ * -+ */ -+ -+#include "tuxonice_modules.h" -+ -+#ifndef EXTENT_H -+#define EXTENT_H -+ -+struct hibernate_extent { -+ unsigned long start, end; -+ struct hibernate_extent *next; -+}; -+ -+struct hibernate_extent_chain { -+ unsigned long size; /* size of the chain ie sum (max-min+1) */ -+ int num_extents; -+ struct hibernate_extent *first, *last_touched; -+ struct hibernate_extent *current_extent; -+ unsigned long current_offset; -+}; -+ -+/* Simplify iterating through all the values in an extent chain */ -+#define toi_extent_for_each(extent_chain, extentpointer, value) \ -+if ((extent_chain)->first) \ -+ for ((extentpointer) = (extent_chain)->first, (value) = \ -+ (extentpointer)->start; \ -+ ((extentpointer) && ((extentpointer)->next || (value) <= \ -+ (extentpointer)->end)); \ -+ (((value) == (extentpointer)->end) ? \ -+ ((extentpointer) = (extentpointer)->next, (value) = \ -+ ((extentpointer) ? (extentpointer)->start : 0)) : \ -+ (value)++)) -+ -+#endif -diff --git a/kernel/power/tuxonice_file.c b/kernel/power/tuxonice_file.c -new file mode 100644 -index 0000000..39f2aea ---- /dev/null -+++ b/kernel/power/tuxonice_file.c -@@ -0,0 +1,496 @@ -+/* -+ * kernel/power/tuxonice_file.c -+ * -+ * Copyright (C) 2005-2010 Nigel Cunningham (nigel at tuxonice net) -+ * -+ * Distributed under GPLv2. -+ * -+ * This file encapsulates functions for usage of a simple file as a -+ * backing store. It is based upon the swapallocator, and shares the -+ * same basic working. Here, though, we have nothing to do with -+ * swapspace, and only one device to worry about. 
-+ * -+ * The user can just -+ * -+ * echo TuxOnIce > /path/to/my_file -+ * -+ * dd if=/dev/zero bs=1M count= >> /path/to/my_file -+ * -+ * and -+ * -+ * echo /path/to/my_file > /sys/power/tuxonice/file/target -+ * -+ * then put what they find in /sys/power/tuxonice/resume -+ * as their resume= parameter in lilo.conf (and rerun lilo if using it). -+ * -+ * Having done this, they're ready to hibernate and resume. -+ * -+ * TODO: -+ * - File resizing. -+ */ -+ -+#include -+#include -+#include -+#include -+ -+#include "tuxonice.h" -+#include "tuxonice_modules.h" -+#include "tuxonice_bio.h" -+#include "tuxonice_alloc.h" -+#include "tuxonice_builtin.h" -+#include "tuxonice_sysfs.h" -+#include "tuxonice_ui.h" -+#include "tuxonice_io.h" -+ -+#define target_is_normal_file() (S_ISREG(target_inode->i_mode)) -+ -+static struct toi_module_ops toi_fileops; -+ -+static struct file *target_file; -+static struct block_device *toi_file_target_bdev; -+static unsigned long pages_available, pages_allocated; -+static char toi_file_target[256]; -+static struct inode *target_inode; -+static int file_target_priority; -+static int used_devt; -+static int target_claim; -+static dev_t toi_file_dev_t; -+static int sig_page_index; -+ -+/* For test_toi_file_target */ -+static struct toi_bdev_info *file_chain; -+ -+static int has_contiguous_blocks(struct toi_bdev_info *dev_info, int page_num) -+{ -+ int j; -+ sector_t last = 0; -+ -+ for (j = 0; j < dev_info->blocks_per_page; j++) { -+ sector_t this = bmap(target_inode, -+ page_num * dev_info->blocks_per_page + j); -+ -+ if (!this || (last && (last + 1) != this)) -+ break; -+ -+ last = this; -+ } -+ -+ return j == dev_info->blocks_per_page; -+} -+ -+static unsigned long get_usable_pages(struct toi_bdev_info *dev_info) -+{ -+ unsigned long result = 0; -+ struct block_device *bdev = dev_info->bdev; -+ int i; -+ -+ switch (target_inode->i_mode & S_IFMT) { -+ case S_IFSOCK: -+ case S_IFCHR: -+ case S_IFIFO: /* Socket, Char, Fifo */ -+ return -1; -+ 
case S_IFREG: /* Regular file: current size - holes + free -+ space on part */ -+ for (i = 0; i < (target_inode->i_size >> PAGE_SHIFT) ; i++) { -+ if (has_contiguous_blocks(dev_info, i)) -+ result++; -+ } -+ break; -+ case S_IFBLK: /* Block device */ -+ if (!bdev->bd_disk) { -+ toi_message(TOI_IO, TOI_VERBOSE, 0, -+ "bdev->bd_disk null."); -+ return 0; -+ } -+ -+ result = (bdev->bd_part ? -+ bdev->bd_part->nr_sects : -+ get_capacity(bdev->bd_disk)) >> (PAGE_SHIFT - 9); -+ } -+ -+ -+ return result; -+} -+ -+static int toi_file_register_storage(void) -+{ -+ struct toi_bdev_info *devinfo; -+ int result = 0; -+ struct fs_info *fs_info; -+ -+ toi_message(TOI_IO, TOI_VERBOSE, 0, "toi_file_register_storage."); -+ if (!strlen(toi_file_target)) { -+ toi_message(TOI_IO, TOI_VERBOSE, 0, "Register file storage: " -+ "No target filename set."); -+ return 0; -+ } -+ -+ target_file = filp_open(toi_file_target, O_RDONLY|O_LARGEFILE, 0); -+ toi_message(TOI_IO, TOI_VERBOSE, 0, "filp_open %s returned %p.", -+ toi_file_target, target_file); -+ -+ if (IS_ERR(target_file) || !target_file) { -+ target_file = NULL; -+ toi_file_dev_t = name_to_dev_t(toi_file_target); -+ if (!toi_file_dev_t) { -+ struct kstat stat; -+ int error = vfs_stat(toi_file_target, &stat); -+ printk(KERN_INFO "Open file %s returned %p and " -+ "name_to_devt failed.\n", -+ toi_file_target, target_file); -+ if (error) { -+ printk(KERN_INFO "Stating the file also failed." 
-+ " Nothing more we can do.\n"); -+ return 0; -+ } else -+ toi_file_dev_t = stat.rdev; -+ } -+ -+ toi_file_target_bdev = toi_open_by_devnum(toi_file_dev_t); -+ if (IS_ERR(toi_file_target_bdev)) { -+ printk(KERN_INFO "Got a dev_num (%lx) but failed to " -+ "open it.\n", -+ (unsigned long) toi_file_dev_t); -+ toi_file_target_bdev = NULL; -+ return 0; -+ } -+ used_devt = 1; -+ target_inode = toi_file_target_bdev->bd_inode; -+ } else -+ target_inode = target_file->f_mapping->host; -+ -+ toi_message(TOI_IO, TOI_VERBOSE, 0, "Succeeded in opening the target."); -+ if (S_ISLNK(target_inode->i_mode) || S_ISDIR(target_inode->i_mode) || -+ S_ISSOCK(target_inode->i_mode) || S_ISFIFO(target_inode->i_mode)) { -+ printk(KERN_INFO "File support works with regular files," -+ " character files and block devices.\n"); -+ /* Cleanup routine will undo the above */ -+ return 0; -+ } -+ -+ if (!used_devt) { -+ if (S_ISBLK(target_inode->i_mode)) { -+ toi_file_target_bdev = I_BDEV(target_inode); -+ if (!bd_claim(toi_file_target_bdev, &toi_fileops)) -+ target_claim = 1; -+ } else -+ toi_file_target_bdev = target_inode->i_sb->s_bdev; -+ if (!toi_file_target_bdev) { -+ printk(KERN_INFO "%s is not a valid file allocator " -+ "target.\n", toi_file_target); -+ return 0; -+ } -+ toi_file_dev_t = toi_file_target_bdev->bd_dev; -+ } -+ -+ devinfo = toi_kzalloc(39, sizeof(struct toi_bdev_info), GFP_ATOMIC); -+ if (!devinfo) { -+ printk("Failed to allocate a toi_bdev_info struct for the file allocator.\n"); -+ return -ENOMEM; -+ } -+ -+ devinfo->bdev = toi_file_target_bdev; -+ devinfo->allocator = &toi_fileops; -+ devinfo->allocator_index = 0; -+ -+ fs_info = fs_info_from_block_dev(toi_file_target_bdev); -+ if (fs_info && !IS_ERR(fs_info)) { -+ memcpy(devinfo->uuid, &fs_info->uuid, 16); -+ free_fs_info(fs_info); -+ } else -+ result = (int) PTR_ERR(fs_info); -+ -+ /* Unlike swap code, only complain if fs_info_from_block_dev returned -+ * -ENOMEM. 
The 'file' might be a full partition, so might validly not -+ * have an identifiable type, UUID etc. -+ */ -+ if (result) -+ printk(KERN_DEBUG "Failed to get fs_info for file device (%d).\n", -+ result); -+ devinfo->dev_t = toi_file_dev_t; -+ devinfo->prio = file_target_priority; -+ devinfo->bmap_shift = target_inode->i_blkbits - 9; -+ devinfo->blocks_per_page = -+ (1 << (PAGE_SHIFT - target_inode->i_blkbits)); -+ sprintf(devinfo->name, "file %s", toi_file_target); -+ file_chain = devinfo; -+ toi_message(TOI_IO, TOI_VERBOSE, 0, "Dev_t is %lx. Prio is %d. Bmap " -+ "shift is %d. Blocks per page %d.", -+ devinfo->dev_t, devinfo->prio, devinfo->bmap_shift, -+ devinfo->blocks_per_page); -+ -+ /* Keep one aside for the signature */ -+ pages_available = get_usable_pages(devinfo) - 1; -+ -+ toi_message(TOI_IO, TOI_VERBOSE, 0, "Registering file storage, %lu " -+ "pages.", pages_available); -+ -+ toi_bio_ops.register_storage(devinfo); -+ return 0; -+} -+ -+static unsigned long toi_file_storage_available(void) -+{ -+ return pages_available; -+} -+ -+static int toi_file_allocate_storage(struct toi_bdev_info *chain, -+ unsigned long request) -+{ -+ unsigned long available = pages_available - pages_allocated; -+ unsigned long to_add = min(available, request); -+ -+ toi_message(TOI_IO, TOI_VERBOSE, 0, "Pages available is %lu. Allocated " -+ "is %lu. Allocating %lu pages from file.", -+ pages_available, pages_allocated, to_add); -+ pages_allocated += to_add; -+ -+ return to_add; -+} -+ -+/** -+ * __populate_block_list - add an extent to the chain -+ * @min: Start of the extent (first physical block = sector) -+ * @max: End of the extent (last physical block = sector) -+ * -+ * If TOI_TEST_BIO is set, print a debug message, outputting the min and max -+ * fs block numbers. 
-+ **/ -+static int __populate_block_list(struct toi_bdev_info *chain, int min, int max) -+{ -+ if (test_action_state(TOI_TEST_BIO)) -+ toi_message(TOI_IO, TOI_VERBOSE, 0, "Adding extent %d-%d.", -+ min << chain->bmap_shift, -+ ((max + 1) << chain->bmap_shift) - 1); -+ -+ return toi_add_to_extent_chain(&chain->blocks, min, max); -+} -+ -+static int get_main_pool_phys_params(struct toi_bdev_info *chain) -+{ -+ int i, extent_min = -1, extent_max = -1, result = 0, have_sig_page = 0; -+ unsigned long pages_mapped = 0; -+ -+ toi_message(TOI_IO, TOI_VERBOSE, 0, "Getting file allocator blocks."); -+ -+ if (chain->blocks.first) -+ toi_put_extent_chain(&chain->blocks); -+ -+ if (!target_is_normal_file()) { -+ result = (pages_available > 0) ? -+ __populate_block_list(chain, chain->blocks_per_page, -+ (pages_allocated + 1) * -+ chain->blocks_per_page - 1) : 0; -+ return result; -+ } -+ -+ /* -+ * FIXME: We are assuming the first page is contiguous. Is that -+ * assumption always right? -+ */ -+ -+ for (i = 0; i < (target_inode->i_size >> PAGE_SHIFT); i++) { -+ sector_t new_sector; -+ -+ if (!has_contiguous_blocks(chain, i)) -+ continue; -+ -+ if (!have_sig_page) { -+ have_sig_page = 1; -+ sig_page_index = i; -+ continue; -+ } -+ -+ pages_mapped++; -+ -+ /* Ignore first page - it has the header */ -+ if (pages_mapped == 1) -+ continue; -+ -+ new_sector = bmap(target_inode, (i * chain->blocks_per_page)); -+ -+ /* -+ * I'd love to be able to fill in holes and resize -+ * files, but not yet... 
-+ */
-+
-+ if (new_sector == extent_max + 1)
-+ extent_max += chain->blocks_per_page;
-+ else {
-+ if (extent_min > -1) {
-+ result = __populate_block_list(chain,
-+ extent_min, extent_max);
-+ if (result)
-+ return result;
-+ }
-+
-+ extent_min = new_sector;
-+ extent_max = extent_min +
-+ chain->blocks_per_page - 1;
-+ }
-+
-+ if (pages_mapped == pages_allocated)
-+ break;
-+ }
-+
-+ if (extent_min > -1) {
-+ result = __populate_block_list(chain, extent_min, extent_max);
-+ if (result)
-+ return result;
-+ }
-+
-+ return 0;
-+}
-+
-+static void toi_file_free_storage(struct toi_bdev_info *chain)
-+{
-+ pages_allocated = 0;
-+ file_chain = NULL;
-+}
-+
-+/**
-+ * toi_file_print_debug_stats - print debug info
-+ * @buffer: Buffer to populate
-+ * @size: Size of the buffer
-+ **/
-+static int toi_file_print_debug_stats(char *buffer, int size)
-+{
-+ int len = scnprintf(buffer, size, "- File Allocator active.\n");
-+
-+ len += scnprintf(buffer+len, size-len, " Storage available for "
-+ "image: %lu pages.\n", pages_available);
-+
-+ return len;
-+}
-+
-+static void toi_file_cleanup(int finishing_cycle)
-+{
-+ if (toi_file_target_bdev) {
-+ if (target_claim) {
-+ bd_release(toi_file_target_bdev);
-+ target_claim = 0;
-+ }
-+
-+ if (used_devt) {
-+ blkdev_put(toi_file_target_bdev,
-+ FMODE_READ | FMODE_NDELAY);
-+ used_devt = 0;
-+ }
-+ toi_file_target_bdev = NULL;
-+ target_inode = NULL;
-+ }
-+
-+ if (target_file) {
-+ filp_close(target_file, NULL);
-+ target_file = NULL;
-+ }
-+
-+ pages_available = 0;
-+}
-+
-+/**
-+ * test_toi_file_target - sysfs callback for /sys/power/tuxonice/file/target
-+ *
-+ * Test whether the target file is valid for hibernating.
-+ **/
-+static void test_toi_file_target(void)
-+{
-+ int result = toi_file_register_storage();
-+ sector_t sector;
-+ char buf[33];
-+ struct fs_info *fs_info;
-+
-+ if (result || !file_chain)
-+ return;
-+
-+ /* This doesn't mean we're in business. Is any storage available?
*/ -+ if (!pages_available) -+ goto out; -+ -+ toi_file_allocate_storage(file_chain, 1); -+ result = get_main_pool_phys_params(file_chain); -+ if (result) -+ goto out; -+ -+ -+ sector = bmap(target_inode, sig_page_index * -+ file_chain->blocks_per_page) << file_chain->bmap_shift; -+ -+ /* Use the uuid, or the dev_t if that fails */ -+ fs_info = fs_info_from_block_dev(toi_file_target_bdev); -+ if (!fs_info || IS_ERR(fs_info)) { -+ bdevname(toi_file_target_bdev, buf); -+ sprintf(resume_file, "/dev/%s:%llu", buf, -+ (unsigned long long) sector); -+ } else { -+ int i; -+ hex_dump_to_buffer(fs_info->uuid, 16, 32, 1, buf, 50, 0); -+ -+ /* Remove the spaces */ -+ for (i = 1; i < 16; i++) { -+ buf[2 * i] = buf[3 * i]; -+ buf[2 * i + 1] = buf[3 * i + 1]; -+ } -+ buf[32] = 0; -+ sprintf(resume_file, "UUID=%s:0x%llx", buf, -+ (unsigned long long) sector); -+ free_fs_info(fs_info); -+ } -+ -+ toi_attempt_to_parse_resume_device(0); -+out: -+ toi_file_free_storage(file_chain); -+ toi_bio_ops.free_storage(); -+} -+ -+static struct toi_sysfs_data sysfs_params[] = { -+ SYSFS_STRING("target", SYSFS_RW, toi_file_target, 256, -+ SYSFS_NEEDS_SM_FOR_WRITE, test_toi_file_target), -+ SYSFS_INT("enabled", SYSFS_RW, &toi_fileops.enabled, 0, 1, 0, NULL), -+ SYSFS_INT("priority", SYSFS_RW, &file_target_priority, -4095, -+ 4096, 0, NULL), -+}; -+ -+static struct toi_bio_allocator_ops toi_bio_fileops = { -+ .register_storage = toi_file_register_storage, -+ .storage_available = toi_file_storage_available, -+ .allocate_storage = toi_file_allocate_storage, -+ .bmap = get_main_pool_phys_params, -+ .free_storage = toi_file_free_storage, -+}; -+ -+static struct toi_module_ops toi_fileops = { -+ .type = BIO_ALLOCATOR_MODULE, -+ .name = "file storage", -+ .directory = "file", -+ .module = THIS_MODULE, -+ .print_debug_info = toi_file_print_debug_stats, -+ .cleanup = toi_file_cleanup, -+ .bio_allocator_ops = &toi_bio_fileops, -+ -+ .sysfs_data = sysfs_params, -+ .num_sysfs_entries = sizeof(sysfs_params) 
/ -+ sizeof(struct toi_sysfs_data), -+}; -+ -+/* ---- Registration ---- */ -+static __init int toi_file_load(void) -+{ -+ return toi_register_module(&toi_fileops); -+} -+ -+#ifdef MODULE -+static __exit void toi_file_unload(void) -+{ -+ toi_unregister_module(&toi_fileops); -+} -+ -+module_init(toi_file_load); -+module_exit(toi_file_unload); -+MODULE_LICENSE("GPL"); -+MODULE_AUTHOR("Nigel Cunningham"); -+MODULE_DESCRIPTION("TuxOnIce FileAllocator"); -+#else -+late_initcall(toi_file_load); -+#endif -diff --git a/kernel/power/tuxonice_highlevel.c b/kernel/power/tuxonice_highlevel.c -new file mode 100644 -index 0000000..c4bbb49 ---- /dev/null -+++ b/kernel/power/tuxonice_highlevel.c -@@ -0,0 +1,1313 @@ -+/* -+ * kernel/power/tuxonice_highlevel.c -+ */ -+/** \mainpage TuxOnIce. -+ * -+ * TuxOnIce provides support for saving and restoring an image of -+ * system memory to an arbitrary storage device, either on the local computer, -+ * or across some network. The support is entirely OS based, so TuxOnIce -+ * works without requiring BIOS, APM or ACPI support. The vast majority of the -+ * code is also architecture independant, so it should be very easy to port -+ * the code to new architectures. TuxOnIce includes support for SMP, 4G HighMem -+ * and preemption. Initramfses and initrds are also supported. -+ * -+ * TuxOnIce uses a modular design, in which the method of storing the image is -+ * completely abstracted from the core code, as are transformations on the data -+ * such as compression and/or encryption (multiple 'modules' can be used to -+ * provide arbitrary combinations of functionality). The user interface is also -+ * modular, so that arbitrarily simple or complex interfaces can be used to -+ * provide anything from debugging information through to eye candy. -+ * -+ * \section Copyright -+ * -+ * TuxOnIce is released under the GPLv2. -+ * -+ * Copyright (C) 1998-2001 Gabor Kuti
-+ * Copyright (C) 1998,2001,2002 Pavel Machek
-+ * Copyright (C) 2002-2003 Florent Chabaud
-+ * Copyright (C) 2002-2010 Nigel Cunningham (nigel at tuxonice net)
-+ * -+ * \section Credits -+ * -+ * Nigel would like to thank the following people for their work: -+ * -+ * Bernard Blackham
-+ * Web page & Wiki administration, some coding. A person without whom -+ * TuxOnIce would not be where it is. -+ * -+ * Michael Frank
-+ * Extensive testing and help with improving stability. I was constantly -+ * amazed by the quality and quantity of Michael's help. -+ * -+ * Pavel Machek
-+ * Modifications, defectiveness pointing, being with Gabor at the very -+ * beginning, suspend to swap space, stop all tasks. Port to 2.4.18-ac and -+ * 2.5.17. Even though Pavel and I disagree on the direction suspend to -+ * disk should take, I appreciate the valuable work he did in helping Gabor -+ * get the concept working. -+ * -+ * ..and of course the myriads of TuxOnIce users who have helped diagnose -+ * and fix bugs, made suggestions on how to improve the code, proofread -+ * documentation, and donated time and money. -+ * -+ * Thanks also to corporate sponsors: -+ * -+ * Redhat.Sometime employer from May 2006 (my fault, not Redhat's!). -+ * -+ * Cyclades.com. Nigel's employers from Dec 2004 until May 2006, who -+ * allowed him to work on TuxOnIce and PM related issues on company time. -+ * -+ * LinuxFund.org. Sponsored Nigel's work on TuxOnIce for four months Oct -+ * 2003 to Jan 2004. -+ * -+ * LAC Linux. Donated P4 hardware that enabled development and ongoing -+ * maintenance of SMP and Highmem support. -+ * -+ * OSDL. Provided access to various hardware configurations, make -+ * occasional small donations to the project. -+ */ -+ -+#include -+#include -+#include -+#include -+#include -+#include -+#include /* for get/set_fs & KERNEL_DS on i386 */ -+#include -+ -+#include "tuxonice.h" -+#include "tuxonice_modules.h" -+#include "tuxonice_sysfs.h" -+#include "tuxonice_prepare_image.h" -+#include "tuxonice_io.h" -+#include "tuxonice_ui.h" -+#include "tuxonice_power_off.h" -+#include "tuxonice_storage.h" -+#include "tuxonice_checksum.h" -+#include "tuxonice_builtin.h" -+#include "tuxonice_atomic_copy.h" -+#include "tuxonice_alloc.h" -+#include "tuxonice_cluster.h" -+ -+/*! Pageset metadata. 
*/ -+struct pagedir pagedir2 = {2}; -+EXPORT_SYMBOL_GPL(pagedir2); -+ -+static mm_segment_t oldfs; -+static DEFINE_MUTEX(tuxonice_in_use); -+static int block_dump_save; -+ -+/* Binary signature if an image is present */ -+char tuxonice_signature[9] = "\xed\xc3\x02\xe9\x98\x56\xe5\x0c"; -+EXPORT_SYMBOL_GPL(tuxonice_signature); -+ -+unsigned long boot_kernel_data_buffer; -+ -+static char *result_strings[] = { -+ "Hibernation was aborted", -+ "The user requested that we cancel the hibernation", -+ "No storage was available", -+ "Insufficient storage was available", -+ "Freezing filesystems and/or tasks failed", -+ "A pre-existing image was used", -+ "We would free memory, but image size limit doesn't allow this", -+ "Unable to free enough memory to hibernate", -+ "Unable to obtain the Power Management Semaphore", -+ "A device suspend/resume returned an error", -+ "A system device suspend/resume returned an error", -+ "The extra pages allowance is too small", -+ "We were unable to successfully prepare an image", -+ "TuxOnIce module initialisation failed", -+ "TuxOnIce module cleanup failed", -+ "I/O errors were encountered", -+ "Ran out of memory", -+ "An error was encountered while reading the image", -+ "Platform preparation failed", -+ "CPU Hotplugging failed", -+ "Architecture specific preparation failed", -+ "Pages needed resaving, but we were told to abort if this happens", -+ "We can't hibernate at the moment (invalid resume= or filewriter " -+ "target?)", -+ "A hibernation preparation notifier chain member cancelled the " -+ "hibernation", -+ "Pre-snapshot preparation failed", -+ "Pre-restore preparation failed", -+ "Failed to disable usermode helpers", -+ "Can't resume from alternate image", -+ "Header reservation too small", -+}; -+ -+/** -+ * toi_finish_anything - cleanup after doing anything -+ * @hibernate_or_resume: Whether finishing a cycle or attempt at -+ * resuming. -+ * -+ * This is our basic clean-up routine, matching start_anything below. 
We -+ * call cleanup routines, drop module references and restore process fs and -+ * cpus allowed masks, together with the global block_dump variable's value. -+ **/ -+void toi_finish_anything(int hibernate_or_resume) -+{ -+ toi_cleanup_modules(hibernate_or_resume); -+ toi_put_modules(); -+ if (hibernate_or_resume) { -+ block_dump = block_dump_save; -+ set_cpus_allowed_ptr(current, cpu_all_mask); -+ toi_alloc_print_debug_stats(); -+ atomic_inc(&snapshot_device_available); -+ mutex_unlock(&pm_mutex); -+ } -+ -+ set_fs(oldfs); -+ mutex_unlock(&tuxonice_in_use); -+} -+ -+/** -+ * toi_start_anything - basic initialisation for TuxOnIce -+ * @toi_or_resume: Whether starting a cycle or attempt at resuming. -+ * -+ * Our basic initialisation routine. Take references on modules, use the -+ * kernel segment, recheck resume= if no active allocator is set, initialise -+ * modules, save and reset block_dump and ensure we're running on CPU0. -+ **/ -+int toi_start_anything(int hibernate_or_resume) -+{ -+ mutex_lock(&tuxonice_in_use); -+ -+ oldfs = get_fs(); -+ set_fs(KERNEL_DS); -+ -+ if (hibernate_or_resume) { -+ mutex_lock(&pm_mutex); -+ -+ if (!atomic_add_unless(&snapshot_device_available, -1, 0)) -+ goto snapshotdevice_unavailable; -+ } -+ -+ if (hibernate_or_resume == SYSFS_HIBERNATE) -+ toi_print_modules(); -+ -+ if (toi_get_modules()) { -+ printk(KERN_INFO "TuxOnIce: Get modules failed!\n"); -+ goto prehibernate_err; -+ } -+ -+ if (hibernate_or_resume) { -+ block_dump_save = block_dump; -+ block_dump = 0; -+ set_cpus_allowed_ptr(current, -+ &cpumask_of_cpu(first_cpu(cpu_online_map))); -+ } -+ -+ if (toi_initialise_modules_early(hibernate_or_resume)) -+ goto early_init_err; -+ -+ if (!toiActiveAllocator) -+ toi_attempt_to_parse_resume_device(!hibernate_or_resume); -+ -+ if (!toi_initialise_modules_late(hibernate_or_resume)) -+ return 0; -+ -+ toi_cleanup_modules(hibernate_or_resume); -+early_init_err: -+ if (hibernate_or_resume) { -+ block_dump_save = block_dump; -+ 
set_cpus_allowed_ptr(current, cpu_all_mask); -+ } -+ toi_put_modules(); -+prehibernate_err: -+ if (hibernate_or_resume) -+ atomic_inc(&snapshot_device_available); -+snapshotdevice_unavailable: -+ if (hibernate_or_resume) -+ mutex_unlock(&pm_mutex); -+ set_fs(oldfs); -+ mutex_unlock(&tuxonice_in_use); -+ return -EBUSY; -+} -+ -+/* -+ * Nosave page tracking. -+ * -+ * Here rather than in prepare_image because we want to do it once only at the -+ * start of a cycle. -+ */ -+ -+/** -+ * mark_nosave_pages - set up our Nosave bitmap -+ * -+ * Build a bitmap of Nosave pages from the list. The bitmap allows faster -+ * use when preparing the image. -+ **/ -+static void mark_nosave_pages(void) -+{ -+ struct nosave_region *region; -+ -+ list_for_each_entry(region, &nosave_regions, list) { -+ unsigned long pfn; -+ -+ for (pfn = region->start_pfn; pfn < region->end_pfn; pfn++) -+ if (pfn_valid(pfn)) -+ SetPageNosave(pfn_to_page(pfn)); -+ } -+} -+ -+static int alloc_a_bitmap(struct memory_bitmap **bm) -+{ -+ int result = 0; -+ -+ *bm = kzalloc(sizeof(struct memory_bitmap), GFP_KERNEL); -+ if (!*bm) { -+ printk(KERN_ERR "Failed to kzalloc memory for a bitmap.\n"); -+ return -ENOMEM; -+ } -+ -+ result = memory_bm_create(*bm, GFP_KERNEL, 0); -+ -+ if (result) { -+ printk(KERN_ERR "Failed to create a bitmap.\n"); -+ kfree(*bm); -+ } -+ -+ return result; -+} -+ -+/** -+ * allocate_bitmaps - allocate bitmaps used to record page states -+ * -+ * Allocate the bitmaps we use to record the various TuxOnIce related -+ * page states. 
-+ **/ -+static int allocate_bitmaps(void) -+{ -+ if (alloc_a_bitmap(&pageset1_map) || -+ alloc_a_bitmap(&pageset1_copy_map) || -+ alloc_a_bitmap(&pageset2_map) || -+ alloc_a_bitmap(&io_map) || -+ alloc_a_bitmap(&nosave_map) || -+ alloc_a_bitmap(&free_map) || -+ alloc_a_bitmap(&page_resave_map)) -+ return 1; -+ -+ return 0; -+} -+ -+static void free_a_bitmap(struct memory_bitmap **bm) -+{ -+ if (!*bm) -+ return; -+ -+ memory_bm_free(*bm, 0); -+ kfree(*bm); -+ *bm = NULL; -+} -+ -+/** -+ * free_bitmaps - free the bitmaps used to record page states -+ * -+ * Free the bitmaps allocated above. It is not an error to call -+ * memory_bm_free on a bitmap that isn't currently allocated. -+ **/ -+static void free_bitmaps(void) -+{ -+ free_a_bitmap(&pageset1_map); -+ free_a_bitmap(&pageset1_copy_map); -+ free_a_bitmap(&pageset2_map); -+ free_a_bitmap(&io_map); -+ free_a_bitmap(&nosave_map); -+ free_a_bitmap(&free_map); -+ free_a_bitmap(&page_resave_map); -+} -+ -+/** -+ * io_MB_per_second - return the number of MB/s read or written -+ * @write: Whether to return the speed at which we wrote. -+ * -+ * Calculate the number of megabytes per second that were read or written. -+ **/ -+static int io_MB_per_second(int write) -+{ -+ return (toi_bkd.toi_io_time[write][1]) ? -+ MB((unsigned long) toi_bkd.toi_io_time[write][0]) * HZ / -+ toi_bkd.toi_io_time[write][1] : 0; -+} -+ -+#define SNPRINTF(a...) do { len += scnprintf(((char *) buffer) + len, \ -+ count - len - 1, ## a); } while (0) -+ -+/** -+ * get_debug_info - fill a buffer with debugging information -+ * @buffer: The buffer to be filled. -+ * @count: The size of the buffer, in bytes. -+ * -+ * Fill a (usually PAGE_SIZEd) buffer with the debugging info that we will -+ * either printk or return via sysfs. 
-+ **/ -+static int get_toi_debug_info(const char *buffer, int count) -+{ -+ int len = 0, i, first_result = 1; -+ -+ SNPRINTF("TuxOnIce debugging info:\n"); -+ SNPRINTF("- TuxOnIce core : " TOI_CORE_VERSION "\n"); -+ SNPRINTF("- Kernel Version : " UTS_RELEASE "\n"); -+ SNPRINTF("- Compiler vers. : %d.%d\n", __GNUC__, __GNUC_MINOR__); -+ SNPRINTF("- Attempt number : %d\n", nr_hibernates); -+ SNPRINTF("- Parameters : %ld %ld %ld %d %ld %ld\n", -+ toi_result, -+ toi_bkd.toi_action, -+ toi_bkd.toi_debug_state, -+ toi_bkd.toi_default_console_level, -+ image_size_limit, -+ toi_poweroff_method); -+ SNPRINTF("- Overall expected compression percentage: %d.\n", -+ 100 - toi_expected_compression_ratio()); -+ len += toi_print_module_debug_info(((char *) buffer) + len, -+ count - len - 1); -+ if (toi_bkd.toi_io_time[0][1]) { -+ if ((io_MB_per_second(0) < 5) || (io_MB_per_second(1) < 5)) { -+ SNPRINTF("- I/O speed: Write %ld KB/s", -+ (KB((unsigned long) toi_bkd.toi_io_time[0][0]) * HZ / -+ toi_bkd.toi_io_time[0][1])); -+ if (toi_bkd.toi_io_time[1][1]) -+ SNPRINTF(", Read %ld KB/s", -+ (KB((unsigned long) -+ toi_bkd.toi_io_time[1][0]) * HZ / -+ toi_bkd.toi_io_time[1][1])); -+ } else { -+ SNPRINTF("- I/O speed: Write %ld MB/s", -+ (MB((unsigned long) toi_bkd.toi_io_time[0][0]) * HZ / -+ toi_bkd.toi_io_time[0][1])); -+ if (toi_bkd.toi_io_time[1][1]) -+ SNPRINTF(", Read %ld MB/s", -+ (MB((unsigned long) -+ toi_bkd.toi_io_time[1][0]) * HZ / -+ toi_bkd.toi_io_time[1][1])); -+ } -+ SNPRINTF(".\n"); -+ } else -+ SNPRINTF("- No I/O speed stats available.\n"); -+ SNPRINTF("- Extra pages : %lu used/%lu.\n", -+ extra_pd1_pages_used, extra_pd1_pages_allowance); -+ -+ for (i = 0; i < TOI_NUM_RESULT_STATES; i++) -+ if (test_result_state(i)) { -+ SNPRINTF("%s: %s.\n", first_result ? -+ "- Result " : -+ " ", -+ result_strings[i]); -+ first_result = 0; -+ } -+ if (first_result) -+ SNPRINTF("- Result : %s.\n", nr_hibernates ? 
-+ "Succeeded" : -+ "No hibernation attempts so far"); -+ return len; -+} -+ -+/** -+ * do_cleanup - cleanup after attempting to hibernate or resume -+ * @get_debug_info: Whether to allocate and return debugging info. -+ * -+ * Cleanup after attempting to hibernate or resume, possibly getting -+ * debugging info as we do so. -+ **/ -+static void do_cleanup(int get_debug_info, int restarting) -+{ -+ int i = 0; -+ char *buffer = NULL; -+ -+ trap_non_toi_io = 0; -+ -+ if (get_debug_info) -+ toi_prepare_status(DONT_CLEAR_BAR, "Cleaning up..."); -+ -+ free_checksum_pages(); -+ -+ if (get_debug_info) -+ buffer = (char *) toi_get_zeroed_page(20, TOI_ATOMIC_GFP); -+ -+ if (buffer) -+ i = get_toi_debug_info(buffer, PAGE_SIZE); -+ -+ toi_free_extra_pagedir_memory(); -+ -+ pagedir1.size = 0; -+ pagedir2.size = 0; -+ set_highmem_size(pagedir1, 0); -+ set_highmem_size(pagedir2, 0); -+ -+ if (boot_kernel_data_buffer) { -+ if (!test_toi_state(TOI_BOOT_KERNEL)) -+ toi_free_page(37, boot_kernel_data_buffer); -+ boot_kernel_data_buffer = 0; -+ } -+ -+ clear_toi_state(TOI_BOOT_KERNEL); -+ thaw_processes(); -+ -+ if (test_action_state(TOI_KEEP_IMAGE) && -+ !test_result_state(TOI_ABORTED)) { -+ toi_message(TOI_ANY_SECTION, TOI_LOW, 1, -+ "TuxOnIce: Not invalidating the image due " -+ "to Keep Image being enabled."); -+ set_result_state(TOI_KEPT_IMAGE); -+ } else -+ if (toiActiveAllocator) -+ toiActiveAllocator->remove_image(); -+ -+ free_bitmaps(); -+ usermodehelper_enable(); -+ -+ if (test_toi_state(TOI_NOTIFIERS_PREPARE)) { -+ pm_notifier_call_chain(PM_POST_HIBERNATION); -+ clear_toi_state(TOI_NOTIFIERS_PREPARE); -+ } -+ -+ if (buffer && i) { -+ /* Printk can only handle 1023 bytes, including -+ * its level mangling. 
*/ -+ for (i = 0; i < 3; i++) -+ printk(KERN_ERR "%s", buffer + (1023 * i)); -+ toi_free_page(20, (unsigned long) buffer); -+ } -+ -+ if (!test_action_state(TOI_LATE_CPU_HOTPLUG)) -+ enable_nonboot_cpus(); -+ -+ if (!restarting) -+ toi_cleanup_console(); -+ -+ free_attention_list(); -+ -+ if (!restarting) -+ toi_deactivate_storage(0); -+ -+ clear_toi_state(TOI_IGNORE_LOGLEVEL); -+ clear_toi_state(TOI_TRYING_TO_RESUME); -+ clear_toi_state(TOI_NOW_RESUMING); -+} -+ -+/** -+ * check_still_keeping_image - we kept an image; check whether to reuse it. -+ * -+ * We enter this routine when we have kept an image. If the user has said they -+ * want to still keep it, all we need to do is powerdown. If powering down -+ * means hibernating to ram and the power doesn't run out, we'll return 1. -+ * If we do power off properly or the battery runs out, we'll resume via the -+ * normal paths. -+ * -+ * If the user has said they want to remove the previously kept image, we -+ * remove it, and return 0. We'll then store a new image. -+ **/ -+static int check_still_keeping_image(void) -+{ -+ if (test_action_state(TOI_KEEP_IMAGE)) { -+ printk(KERN_INFO "Image already stored: powering down " -+ "immediately."); -+ do_toi_step(STEP_HIBERNATE_POWERDOWN); -+ return 1; /* Just in case we're using S3 */ -+ } -+ -+ printk(KERN_INFO "Invalidating previous image.\n"); -+ toiActiveAllocator->remove_image(); -+ -+ return 0; -+} -+ -+/** -+ * toi_init - prepare to hibernate to disk -+ * -+ * Initialise variables & data structures, in preparation for -+ * hibernating to disk. 
-+ **/
-+static int toi_init(int restarting)
-+{
-+ int result, i, j;
-+
-+ toi_result = 0;
-+
-+ printk(KERN_INFO "Initiating a hibernation cycle.\n");
-+
-+ nr_hibernates++;
-+
-+ for (i = 0; i < 2; i++)
-+ for (j = 0; j < 2; j++)
-+ toi_bkd.toi_io_time[i][j] = 0;
-+
-+ if (!test_toi_state(TOI_CAN_HIBERNATE) ||
-+ allocate_bitmaps())
-+ return 1;
-+
-+ mark_nosave_pages();
-+
-+ if (!restarting)
-+ toi_prepare_console();
-+
-+ result = pm_notifier_call_chain(PM_HIBERNATION_PREPARE);
-+ if (result) {
-+ set_result_state(TOI_NOTIFIERS_PREPARE_FAILED);
-+ return 1;
-+ }
-+ set_toi_state(TOI_NOTIFIERS_PREPARE);
-+
-+ result = usermodehelper_disable();
-+ if (result) {
-+ printk(KERN_ERR "TuxOnIce: Failed to disable usermode "
-+ "helpers\n");
-+ set_result_state(TOI_USERMODE_HELPERS_ERR);
-+ return 1;
-+ }
-+
-+ boot_kernel_data_buffer = toi_get_zeroed_page(37, TOI_ATOMIC_GFP);
-+ if (!boot_kernel_data_buffer) {
-+ printk(KERN_ERR "TuxOnIce: Failed to allocate "
-+ "boot_kernel_data_buffer.\n");
-+ set_result_state(TOI_OUT_OF_MEMORY);
-+ return 1;
-+ }
-+
-+ if (test_action_state(TOI_LATE_CPU_HOTPLUG) ||
-+ !disable_nonboot_cpus())
-+ return 1;
-+
-+ set_abort_result(TOI_CPU_HOTPLUG_FAILED);
-+ return 0;
-+}
-+
-+/**
-+ * can_hibernate - perform basic 'Can we hibernate?' tests
-+ *
-+ * Perform basic tests that must pass if we're going to be able to hibernate:
-+ * Can we get the pm_mutex? Is resume= valid (we need to know where to write
-+ * the image header).
-+ **/
-+static int can_hibernate(void)
-+{
-+ if (!test_toi_state(TOI_CAN_HIBERNATE))
-+ toi_attempt_to_parse_resume_device(0);
-+
-+ if (!test_toi_state(TOI_CAN_HIBERNATE)) {
-+ printk(KERN_INFO "TuxOnIce: Hibernation is disabled.\n"
-+ "This may be because you haven't put something along "
-+ "the lines of\n\nresume=swap:/dev/hda1\n\n"
-+ "in lilo.conf or equivalent. (Where /dev/hda1 is your "
-+ "swap partition).\n");
-+ set_abort_result(TOI_CANT_SUSPEND);
-+ return 0;
-+ }
-+
-+ if (strlen(alt_resume_param)) {
-+ attempt_to_parse_alt_resume_param();
-+
-+ if (!strlen(alt_resume_param)) {
-+ printk(KERN_INFO "Alternate resume parameter now "
-+ "invalid. Aborting.\n");
-+ set_abort_result(TOI_CANT_USE_ALT_RESUME);
-+ return 0;
-+ }
-+ }
-+
-+ return 1;
-+}
-+
-+/**
-+ * do_post_image_write - having written an image, figure out what to do next
-+ *
-+ * After writing an image, we might load an alternate image or power down.
-+ * Powering down might involve hibernating to ram, in which case we also
-+ * need to handle reloading pageset2.
-+ **/
-+static int do_post_image_write(void)
-+{
-+ /* If switching images fails, do normal powerdown */
-+ if (alt_resume_param[0])
-+ do_toi_step(STEP_RESUME_ALT_IMAGE);
-+
-+ toi_power_down();
-+
-+ barrier();
-+ mb();
-+ return 0;
-+}
-+
-+/**
-+ * __save_image - do the hard work of saving the image
-+ *
-+ * High level routine for getting the image saved. The key assumptions made
-+ * are that processes have been frozen and sufficient memory is available.
-+ *
-+ * We also exit through here at resume time, coming back from toi_hibernate
-+ * after the atomic restore. This is the reason for the toi_in_hibernate
-+ * test.
-+ **/
-+static int __save_image(void)
-+{
-+ int temp_result, did_copy = 0;
-+
-+ toi_prepare_status(DONT_CLEAR_BAR, "Starting to save the image..");
-+
-+ toi_message(TOI_ANY_SECTION, TOI_LOW, 1,
-+ " - Final values: %d and %d.",
-+ pagedir1.size, pagedir2.size);
-+
-+ toi_cond_pause(1, "About to write pagedir2.");
-+
-+ temp_result = write_pageset(&pagedir2);
-+
-+ if (temp_result == -1 || test_result_state(TOI_ABORTED))
-+ return 1;
-+
-+ toi_cond_pause(1, "About to copy pageset 1.");
-+
-+ if (test_result_state(TOI_ABORTED))
-+ return 1;
-+
-+ toi_deactivate_storage(1);
-+
-+ toi_prepare_status(DONT_CLEAR_BAR, "Doing atomic copy/restore.");
-+
-+ toi_in_hibernate = 1;
-+
-+ if (toi_go_atomic(PMSG_FREEZE, 1))
-+ goto Failed;
-+
-+ temp_result = toi_hibernate();
-+ if (!temp_result)
-+ did_copy = 1;
-+
-+ /* We return here at resume time too! */
-+ toi_end_atomic(ATOMIC_ALL_STEPS, toi_in_hibernate, temp_result);
-+
-+Failed:
-+ if (toi_activate_storage(1))
-+ panic("Failed to reactivate our storage.");
-+
-+ /* Resume time? */
-+ if (!toi_in_hibernate) {
-+ copyback_post();
-+ return 0;
-+ }
-+
-+ /* Nope. Hibernating. So, see if we can save the image... */
-+
-+ if (temp_result || test_result_state(TOI_ABORTED)) {
-+ if (did_copy)
-+ goto abort_reloading_pagedir_two;
-+ else
-+ return 1;
-+ }
-+
-+ toi_update_status(pagedir2.size, pagedir1.size + pagedir2.size,
-+ NULL);
-+
-+ if (test_result_state(TOI_ABORTED))
-+ goto abort_reloading_pagedir_two;
-+
-+ toi_cond_pause(1, "About to write pageset1.");
-+
-+ toi_message(TOI_ANY_SECTION, TOI_LOW, 1, "-- Writing pageset1");
-+
-+ temp_result = write_pageset(&pagedir1);
-+
-+ /* We didn't overwrite any memory, so no reread needs to be done. */
-+ if (test_action_state(TOI_TEST_FILTER_SPEED))
-+ return 1;
-+
-+ if (temp_result == 1 || test_result_state(TOI_ABORTED))
-+ goto abort_reloading_pagedir_two;
-+
-+ toi_cond_pause(1, "About to write header.");
-+
-+ if (test_result_state(TOI_ABORTED))
-+ goto abort_reloading_pagedir_two;
-+
-+ temp_result = write_image_header();
-+
-+ if (test_action_state(TOI_TEST_BIO))
-+ return 1;
-+
-+ if (!temp_result && !test_result_state(TOI_ABORTED))
-+ return 0;
-+
-+abort_reloading_pagedir_two:
-+ temp_result = read_pageset2(1);
-+
-+ /* If that failed, we're sunk. Panic! */
-+ if (temp_result)
-+ panic("Attempt to reload pagedir 2 while aborting "
-+ "a hibernate failed.");
-+
-+ return 1;
-+}
-+
-+static void map_ps2_pages(int enable)
-+{
-+ unsigned long pfn = 0;
-+
-+ pfn = memory_bm_next_pfn(pageset2_map);
-+
-+ while (pfn != BM_END_OF_MAP) {
-+ struct page *page = pfn_to_page(pfn);
-+ kernel_map_pages(page, 1, enable);
-+ pfn = memory_bm_next_pfn(pageset2_map);
-+ }
-+}
-+
-+/**
-+ * do_save_image - save the image and handle the result
-+ *
-+ * Save the prepared image. If we fail or we're in the path returning
-+ * from the atomic restore, cleanup.
-+ **/
-+static int do_save_image(void)
-+{
-+ int result;
-+ map_ps2_pages(0);
-+ result = __save_image();
-+ map_ps2_pages(1);
-+ return result;
-+}
-+
-+/**
-+ * do_prepare_image - try to prepare an image
-+ *
-+ * Seek to initialise and prepare an image to be saved. On failure,
-+ * cleanup.
-+ **/
-+static int do_prepare_image(void)
-+{
-+ int restarting = test_result_state(TOI_EXTRA_PAGES_ALLOW_TOO_SMALL);
-+
-+ if (!restarting && toi_activate_storage(0))
-+ return 1;
-+
-+ /*
-+ * If kept image and still keeping image and hibernating to RAM, we will
-+ * return 1 after hibernating and resuming (provided the power doesn't
-+ * run out). In that case, we skip directly to cleaning up and exiting.
-+ */
-+
-+ if (!can_hibernate() ||
-+ (test_result_state(TOI_KEPT_IMAGE) &&
-+ check_still_keeping_image()))
-+ return 1;
-+
-+ if (toi_init(restarting) && !toi_prepare_image() &&
-+ !test_result_state(TOI_ABORTED))
-+ return 0;
-+
-+ trap_non_toi_io = 1;
-+
-+ return 1;
-+}
-+
-+/**
-+ * do_check_can_resume - find out whether an image has been stored
-+ *
-+ * Read whether an image exists. We use the same routine as the
-+ * image_exists sysfs entry, and just look to see whether the
-+ * first character in the resulting buffer is a '1'.
-+ **/
-+int do_check_can_resume(void)
-+{
-+ int result = -1;
-+
-+ if (toi_activate_storage(0))
-+ return -1;
-+
-+ if (!test_toi_state(TOI_RESUME_DEVICE_OK))
-+ toi_attempt_to_parse_resume_device(1);
-+
-+ if (toiActiveAllocator)
-+ result = toiActiveAllocator->image_exists(1);
-+
-+ toi_deactivate_storage(0);
-+ return result;
-+}
-+EXPORT_SYMBOL_GPL(do_check_can_resume);
-+
-+/**
-+ * do_load_atomic_copy - load the first part of an image, if it exists
-+ *
-+ * Check whether we have an image. If one exists, do sanity checking
-+ * (possibly invalidating the image or even rebooting if the user
-+ * requests that) before loading it into memory in preparation for the
-+ * atomic restore.
-+ *
-+ * If and only if we have an image loaded and ready to restore, we return 1.
-+ **/
-+static int do_load_atomic_copy(void)
-+{
-+ int read_image_result = 0;
-+
-+ if (sizeof(swp_entry_t) != sizeof(long)) {
-+ printk(KERN_WARNING "TuxOnIce: The size of swp_entry_t != size"
-+ " of long. Please report this!\n");
-+ return 1;
-+ }
-+
-+ if (!resume_file[0])
-+ printk(KERN_WARNING "TuxOnIce: "
-+ "You need to use a resume= command line parameter to "
-+ "tell TuxOnIce where to look for an image.\n");
-+
-+ toi_activate_storage(0);
-+
-+ if (!(test_toi_state(TOI_RESUME_DEVICE_OK)) &&
-+ !toi_attempt_to_parse_resume_device(0)) {
-+ /*
-+ * Without a usable storage device we can do nothing -
-+ * even if noresume is given
-+ */
-+
-+ if (!toiNumAllocators)
-+ printk(KERN_ALERT "TuxOnIce: "
-+ "No storage allocators have been registered.\n");
-+ else
-+ printk(KERN_ALERT "TuxOnIce: "
-+ "Missing or invalid storage location "
-+ "(resume= parameter). Please correct and "
-+ "rerun lilo (or equivalent) before "
-+ "hibernating.\n");
-+ toi_deactivate_storage(0);
-+ return 1;
-+ }
-+
-+ if (allocate_bitmaps())
-+ return 1;
-+
-+ read_image_result = read_pageset1(); /* non fatal error ignored */
-+
-+ if (test_toi_state(TOI_NORESUME_SPECIFIED))
-+ clear_toi_state(TOI_NORESUME_SPECIFIED);
-+
-+ toi_deactivate_storage(0);
-+
-+ if (read_image_result)
-+ return 1;
-+
-+ return 0;
-+}
-+
-+/**
-+ * prepare_restore_load_alt_image - save & restore alt image variables
-+ *
-+ * Save and restore the pageset1 maps, when loading an alternate image.
-+ **/
-+static void prepare_restore_load_alt_image(int prepare)
-+{
-+ static struct memory_bitmap *pageset1_map_save, *pageset1_copy_map_save;
-+
-+ if (prepare) {
-+ pageset1_map_save = pageset1_map;
-+ pageset1_map = NULL;
-+ pageset1_copy_map_save = pageset1_copy_map;
-+ pageset1_copy_map = NULL;
-+ set_toi_state(TOI_LOADING_ALT_IMAGE);
-+ toi_reset_alt_image_pageset2_pfn();
-+ } else {
-+ memory_bm_free(pageset1_map, 0);
-+ pageset1_map = pageset1_map_save;
-+ memory_bm_free(pageset1_copy_map, 0);
-+ pageset1_copy_map = pageset1_copy_map_save;
-+ clear_toi_state(TOI_NOW_RESUMING);
-+ clear_toi_state(TOI_LOADING_ALT_IMAGE);
-+ }
-+}
-+
-+/**
-+ * do_toi_step - perform a step in hibernating or resuming
-+ *
-+ * Perform a step in hibernating or resuming an image. This abstraction
-+ * is in preparation for implementing cluster support, and perhaps replacing
-+ * uswsusp too (haven't looked whether that's possible yet).
-+ **/
-+int do_toi_step(int step)
-+{
-+ switch (step) {
-+ case STEP_HIBERNATE_PREPARE_IMAGE:
-+ return do_prepare_image();
-+ case STEP_HIBERNATE_SAVE_IMAGE:
-+ return do_save_image();
-+ case STEP_HIBERNATE_POWERDOWN:
-+ return do_post_image_write();
-+ case STEP_RESUME_CAN_RESUME:
-+ return do_check_can_resume();
-+ case STEP_RESUME_LOAD_PS1:
-+ return do_load_atomic_copy();
-+ case STEP_RESUME_DO_RESTORE:
-+ /*
-+ * If we succeed, this doesn't return.
-+ * Instead, we return from do_save_image() in the
-+ * hibernated kernel.
-+ */
-+ return toi_atomic_restore();
-+ case STEP_RESUME_ALT_IMAGE:
-+ printk(KERN_INFO "Trying to resume alternate image.\n");
-+ toi_in_hibernate = 0;
-+ save_restore_alt_param(SAVE, NOQUIET);
-+ prepare_restore_load_alt_image(1);
-+ if (!do_check_can_resume()) {
-+ printk(KERN_INFO "Nothing to resume from.\n");
-+ goto out;
-+ }
-+ if (!do_load_atomic_copy())
-+ toi_atomic_restore();
-+
-+ printk(KERN_INFO "Failed to load image.\n");
-+out:
-+ prepare_restore_load_alt_image(0);
-+ save_restore_alt_param(RESTORE, NOQUIET);
-+ break;
-+ case STEP_CLEANUP:
-+ do_cleanup(1, 0);
-+ break;
-+ case STEP_QUIET_CLEANUP:
-+ do_cleanup(0, 0);
-+ break;
-+ }
-+
-+ return 0;
-+}
-+EXPORT_SYMBOL_GPL(do_toi_step);
-+
-+/* -- Functions for kickstarting a hibernate or resume --- */
-+
-+/**
-+ * toi_try_resume - try to do the steps in resuming
-+ *
-+ * Check if we have an image and if so try to resume. Clear the status
-+ * flags too.
-+ **/
-+void toi_try_resume(void)
-+{
-+ set_toi_state(TOI_TRYING_TO_RESUME);
-+ resume_attempted = 1;
-+
-+ current->flags |= PF_MEMALLOC;
-+
-+ if (do_toi_step(STEP_RESUME_CAN_RESUME) &&
-+ !do_toi_step(STEP_RESUME_LOAD_PS1))
-+ do_toi_step(STEP_RESUME_DO_RESTORE);
-+
-+ do_cleanup(0, 0);
-+
-+ current->flags &= ~PF_MEMALLOC;
-+
-+ clear_toi_state(TOI_IGNORE_LOGLEVEL);
-+ clear_toi_state(TOI_TRYING_TO_RESUME);
-+ clear_toi_state(TOI_NOW_RESUMING);
-+}
-+
-+/**
-+ * toi_sys_power_disk_try_resume - wrapper calling toi_try_resume
-+ *
-+ * Wrapper for when __toi_try_resume is called from swsusp resume path,
-+ * rather than from echo > /sys/power/tuxonice/do_resume.
-+ **/
-+static void toi_sys_power_disk_try_resume(void)
-+{
-+ resume_attempted = 1;
-+
-+ /*
-+ * There's a comment in kernel/power/disk.c that indicates
-+ * we should be able to use mutex_lock_nested below. That
-+ * doesn't seem to cut it, though, so let's just turn lockdep
-+ * off for now.
-+ */
-+ lockdep_off();
-+
-+ if (toi_start_anything(SYSFS_RESUMING))
-+ goto out;
-+
-+ toi_try_resume();
-+
-+ /*
-+ * For initramfs, we have to clear the boot time
-+ * flag after trying to resume
-+ */
-+ clear_toi_state(TOI_BOOT_TIME);
-+
-+ toi_finish_anything(SYSFS_RESUMING);
-+out:
-+ lockdep_on();
-+}
-+
-+/**
-+ * toi_try_hibernate - try to start a hibernation cycle
-+ *
-+ * Start a hibernation cycle, coming in from either
-+ * echo > /sys/power/tuxonice/do_suspend
-+ *
-+ * or
-+ *
-+ * echo disk > /sys/power/state
-+ *
-+ * In the latter case, we come in without pm_sem taken; in the
-+ * former, it has been taken.
-+ **/
-+int toi_try_hibernate(void)
-+{
-+ int result = 0, sys_power_disk = 0, retries = 0;
-+
-+ if (!mutex_is_locked(&tuxonice_in_use)) {
-+ /* Came in via /sys/power/disk */
-+ if (toi_start_anything(SYSFS_HIBERNATING))
-+ return -EBUSY;
-+ sys_power_disk = 1;
-+ }
-+
-+ current->flags |= PF_MEMALLOC;
-+
-+ if (test_toi_state(TOI_CLUSTER_MODE)) {
-+ toi_initiate_cluster_hibernate();
-+ goto out;
-+ }
-+
-+prepare:
-+ result = do_toi_step(STEP_HIBERNATE_PREPARE_IMAGE);
-+
-+ if (result || test_action_state(TOI_FREEZER_TEST))
-+ goto out;
-+
-+ result = do_toi_step(STEP_HIBERNATE_SAVE_IMAGE);
-+
-+ if (test_result_state(TOI_EXTRA_PAGES_ALLOW_TOO_SMALL)) {
-+ if (retries < 2) {
-+ do_cleanup(0, 1);
-+ retries++;
-+ clear_result_state(TOI_ABORTED);
-+ extra_pd1_pages_allowance = extra_pd1_pages_used + 500;
-+ printk(KERN_INFO "Automatically adjusting the extra"
-+ " pages allowance to %ld and restarting.\n",
-+ extra_pd1_pages_allowance);
-+ goto prepare;
-+ }
-+
-+ printk(KERN_INFO "Adjusted extra pages allowance twice and "
-+ "still couldn't hibernate successfully. Giving up.");
-+ }
-+
-+ /* This code runs at resume time too! */
-+ if (!result && toi_in_hibernate)
-+ result = do_toi_step(STEP_HIBERNATE_POWERDOWN);
-+out:
-+ do_cleanup(1, 0);
-+ current->flags &= ~PF_MEMALLOC;
-+
-+ if (sys_power_disk)
-+ toi_finish_anything(SYSFS_HIBERNATING);
-+
-+ return result;
-+}
-+
-+/*
-+ * channel_no: If !0, -c is added to args (userui).
-+ */
-+int toi_launch_userspace_program(char *command, int channel_no,
-+ enum umh_wait wait, int debug)
-+{
-+ int retval;
-+ static char *envp[] = {
-+ "HOME=/",
-+ "TERM=linux",
-+ "PATH=/sbin:/usr/sbin:/bin:/usr/bin",
-+ NULL };
-+ static char *argv[] = { NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL
-+ };
-+ char *channel = NULL;
-+ int arg = 0, size;
-+ char test_read[255];
-+ char *orig_posn = command;
-+
-+ if (!strlen(orig_posn))
-+ return 1;
-+
-+ if (channel_no) {
-+ channel = toi_kzalloc(4, 6, GFP_KERNEL);
-+ if (!channel) {
-+ printk(KERN_INFO "Failed to allocate memory in "
-+ "preparing to launch userspace program.\n");
-+ return 1;
-+ }
-+ }
-+
-+ /* Up to 6 args supported */
-+ while (arg < 6) {
-+ sscanf(orig_posn, "%s", test_read);
-+ size = strlen(test_read);
-+ if (!(size))
-+ break;
-+ argv[arg] = toi_kzalloc(5, size + 1, TOI_ATOMIC_GFP);
-+ strcpy(argv[arg], test_read);
-+ orig_posn += size + 1;
-+ *test_read = 0;
-+ arg++;
-+ }
-+
-+ if (channel_no) {
-+ sprintf(channel, "-c%d", channel_no);
-+ argv[arg] = channel;
-+ } else
-+ arg--;
-+
-+ if (debug) {
-+ argv[++arg] = toi_kzalloc(5, 8, TOI_ATOMIC_GFP);
-+ strcpy(argv[arg], "--debug");
-+ }
-+
-+ retval = call_usermodehelper(argv[0], argv, envp, wait);
-+
-+ /*
-+ * If the program reports an error, retval = 256. Don't complain
-+ * about that here.
-+ */
-+ if (retval && retval != 256)
-+ printk(KERN_ERR "Failed to launch userspace program '%s': "
-+ "Error %d\n", command, retval);
-+
-+ {
-+ int i;
-+ for (i = 0; i < arg; i++)
-+ if (argv[i] && argv[i] != channel)
-+ toi_kfree(5, argv[i], sizeof(*argv[i]));
-+ }
-+
-+ toi_kfree(4, channel, sizeof(*channel));
-+
-+ return retval;
-+}
-+
-+/*
-+ * This array contains entries that are automatically registered at
-+ * boot. Modules and the console code register their own entries separately.
-+ */
-+static struct toi_sysfs_data sysfs_params[] = {
-+ SYSFS_INT("freezer_sync", SYSFS_RW, &freezer_sync, 0, 1, 0, NULL),
-+ SYSFS_LONG("extra_pages_allowance", SYSFS_RW,
-+ &extra_pd1_pages_allowance, 0, LONG_MAX, 0),
-+ SYSFS_CUSTOM("image_exists", SYSFS_RW, image_exists_read,
-+ image_exists_write, SYSFS_NEEDS_SM_FOR_BOTH, NULL),
-+ SYSFS_STRING("resume", SYSFS_RW, resume_file, 255,
-+ SYSFS_NEEDS_SM_FOR_WRITE,
-+ attempt_to_parse_resume_device2),
-+ SYSFS_STRING("alt_resume_param", SYSFS_RW, alt_resume_param, 255,
-+ SYSFS_NEEDS_SM_FOR_WRITE,
-+ attempt_to_parse_alt_resume_param),
-+ SYSFS_CUSTOM("debug_info", SYSFS_READONLY, get_toi_debug_info, NULL, 0,
-+ NULL),
-+ SYSFS_BIT("ignore_rootfs", SYSFS_RW, &toi_bkd.toi_action,
-+ TOI_IGNORE_ROOTFS, 0),
-+ SYSFS_LONG("image_size_limit", SYSFS_RW, &image_size_limit, -2,
-+ INT_MAX, 0),
-+ SYSFS_UL("last_result", SYSFS_RW, &toi_result, 0, 0, 0),
-+ SYSFS_BIT("no_multithreaded_io", SYSFS_RW, &toi_bkd.toi_action,
-+ TOI_NO_MULTITHREADED_IO, 0),
-+ SYSFS_BIT("no_flusher_thread", SYSFS_RW, &toi_bkd.toi_action,
-+ TOI_NO_FLUSHER_THREAD, 0),
-+ SYSFS_BIT("full_pageset2", SYSFS_RW, &toi_bkd.toi_action,
-+ TOI_PAGESET2_FULL, 0),
-+ SYSFS_BIT("reboot", SYSFS_RW, &toi_bkd.toi_action, TOI_REBOOT, 0),
-+ SYSFS_BIT("replace_swsusp", SYSFS_RW, &toi_bkd.toi_action,
-+ TOI_REPLACE_SWSUSP, 0),
-+ SYSFS_STRING("resume_commandline", SYSFS_RW,
-+ toi_bkd.toi_nosave_commandline, COMMAND_LINE_SIZE, 0,
-+ NULL),
-+ SYSFS_STRING("version", SYSFS_READONLY, TOI_CORE_VERSION, 0, 0, NULL),
-+ SYSFS_BIT("freezer_test", SYSFS_RW, &toi_bkd.toi_action,
-+ TOI_FREEZER_TEST, 0),
-+ SYSFS_BIT("test_bio", SYSFS_RW, &toi_bkd.toi_action, TOI_TEST_BIO, 0),
-+ SYSFS_BIT("test_filter_speed", SYSFS_RW, &toi_bkd.toi_action,
-+ TOI_TEST_FILTER_SPEED, 0),
-+ SYSFS_BIT("no_pageset2", SYSFS_RW, &toi_bkd.toi_action,
-+ TOI_NO_PAGESET2, 0),
-+ SYSFS_BIT("no_pageset2_if_unneeded", SYSFS_RW, &toi_bkd.toi_action,
-+ TOI_NO_PS2_IF_UNNEEDED, 0),
-+ SYSFS_BIT("late_cpu_hotplug", SYSFS_RW, &toi_bkd.toi_action,
-+ TOI_LATE_CPU_HOTPLUG, 0),
-+ SYSFS_STRING("binary_signature", SYSFS_READONLY,
-+ tuxonice_signature, 9, 0, NULL),
-+ SYSFS_INT("max_workers", SYSFS_RW, &toi_max_workers, 0, NR_CPUS, 0,
-+ NULL),
-+#ifdef CONFIG_TOI_KEEP_IMAGE
-+ SYSFS_BIT("keep_image", SYSFS_RW, &toi_bkd.toi_action, TOI_KEEP_IMAGE,
-+ 0),
-+#endif
-+};
-+
-+static struct toi_core_fns my_fns = {
-+ .get_nonconflicting_page = __toi_get_nonconflicting_page,
-+ .post_context_save = __toi_post_context_save,
-+ .try_hibernate = toi_try_hibernate,
-+ .try_resume = toi_sys_power_disk_try_resume,
-+};
-+
-+/**
-+ * core_load - initialisation of TuxOnIce core
-+ *
-+ * Initialise the core, beginning with sysfs. Checksum and so on are part of
-+ * the core, but have their own initialisation routines because they either
-+ * aren't compiled in all the time or have their own subdirectories.
-+ **/
-+static __init int core_load(void)
-+{
-+ int i,
-+ numfiles = sizeof(sysfs_params) / sizeof(struct toi_sysfs_data);
-+
-+ printk(KERN_INFO "TuxOnIce " TOI_CORE_VERSION
-+ " (http://tuxonice.net)\n");
-+
-+ if (toi_sysfs_init())
-+ return 1;
-+
-+ for (i = 0; i < numfiles; i++)
-+ toi_register_sysfs_file(tuxonice_kobj, &sysfs_params[i]);
-+
-+ toi_core_fns = &my_fns;
-+
-+ if (toi_alloc_init())
-+ return 1;
-+ if (toi_checksum_init())
-+ return 1;
-+ if (toi_usm_init())
-+ return 1;
-+ if (toi_ui_init())
-+ return 1;
-+ if (toi_poweroff_init())
-+ return 1;
-+ if (toi_cluster_init())
-+ return 1;
-+
-+ return 0;
-+}
-+
-+#ifdef MODULE
-+/**
-+ * core_unload: Prepare to unload the core code.
-+ **/
-+static __exit void core_unload(void)
-+{
-+ int i,
-+ numfiles = sizeof(sysfs_params) / sizeof(struct toi_sysfs_data);
-+
-+ toi_alloc_exit();
-+ toi_checksum_exit();
-+ toi_poweroff_exit();
-+ toi_ui_exit();
-+ toi_usm_exit();
-+ toi_cluster_exit();
-+
-+ for (i = 0; i < numfiles; i++)
-+ toi_unregister_sysfs_file(tuxonice_kobj, &sysfs_params[i]);
-+
-+ toi_core_fns = NULL;
-+
-+ toi_sysfs_exit();
-+}
-+MODULE_LICENSE("GPL");
-+module_init(core_load);
-+module_exit(core_unload);
-+#else
-+late_initcall(core_load);
-+#endif
-diff --git a/kernel/power/tuxonice_io.c b/kernel/power/tuxonice_io.c
-new file mode 100644
-index 0000000..02be4d9
---- /dev/null
-+++ b/kernel/power/tuxonice_io.c
-@@ -0,0 +1,1822 @@
-+/*
-+ * kernel/power/tuxonice_io.c
-+ *
-+ * Copyright (C) 1998-2001 Gabor Kuti
-+ * Copyright (C) 1998,2001,2002 Pavel Machek
-+ * Copyright (C) 2002-2003 Florent Chabaud
-+ * Copyright (C) 2002-2010 Nigel Cunningham (nigel at tuxonice net)
-+ *
-+ * This file is released under the GPLv2.
-+ *
-+ * It contains high level IO routines for hibernating.
-+ *
-+ */
-+
-+#include
-+#include
-+#include
-+#include
-+#include
-+#include
-+#include
-+#include
-+#include
-+#include
-+#include
-+
-+#include "tuxonice.h"
-+#include "tuxonice_modules.h"
-+#include "tuxonice_pageflags.h"
-+#include "tuxonice_io.h"
-+#include "tuxonice_ui.h"
-+#include "tuxonice_storage.h"
-+#include "tuxonice_prepare_image.h"
-+#include "tuxonice_extent.h"
-+#include "tuxonice_sysfs.h"
-+#include "tuxonice_builtin.h"
-+#include "tuxonice_checksum.h"
-+#include "tuxonice_alloc.h"
-+char alt_resume_param[256];
-+
-+/* Version read from image header at resume */
-+static int toi_image_header_version;
-+
-+#define read_if_version(VERS, VAR, DESC) do { \
-+ if (likely(toi_image_header_version >= VERS)) \
-+ if (toiActiveAllocator->rw_header_chunk(READ, NULL, \
-+ (char *) &VAR, sizeof(VAR))) { \
-+ abort_hibernate(TOI_FAILED_IO, "Failed to read DESC."); \
-+ goto out_remove_image; \
-+ } \
-+} while(0) \
-+
-+/* Variables shared between threads and updated under the mutex */
-+static int io_write, io_finish_at, io_base, io_barmax, io_pageset, io_result;
-+static int io_index, io_nextupdate, io_pc, io_pc_step;
-+static DEFINE_MUTEX(io_mutex);
-+static DEFINE_PER_CPU(struct page *, last_sought);
-+static DEFINE_PER_CPU(struct page *, last_high_page);
-+static DEFINE_PER_CPU(char *, checksum_locn);
-+static DEFINE_PER_CPU(struct pbe *, last_low_page);
-+static atomic_t io_count;
-+atomic_t toi_io_workers;
-+EXPORT_SYMBOL_GPL(toi_io_workers);
-+
-+DECLARE_WAIT_QUEUE_HEAD(toi_io_queue_flusher);
-+EXPORT_SYMBOL_GPL(toi_io_queue_flusher);
-+
-+int toi_bio_queue_flusher_should_finish;
-+EXPORT_SYMBOL_GPL(toi_bio_queue_flusher_should_finish);
-+
-+/* Indicates that this thread should be used for checking throughput */
-+#define MONITOR ((void *) 1)
-+
-+int toi_max_workers;
-+
-+static char *image_version_error = "The image header version is newer than " \
-+ "this kernel supports.";
-+
-+/**
-+ * toi_attempt_to_parse_resume_device - determine if we can hibernate
-+ *
-+ * Can we hibernate, using the current resume= parameter?
-+ **/
-+int toi_attempt_to_parse_resume_device(int quiet)
-+{
-+ struct list_head *Allocator;
-+ struct toi_module_ops *thisAllocator;
-+ int result, returning = 0;
-+
-+ if (toi_activate_storage(0))
-+ return 0;
-+
-+ toiActiveAllocator = NULL;
-+ clear_toi_state(TOI_RESUME_DEVICE_OK);
-+ clear_toi_state(TOI_CAN_RESUME);
-+ clear_result_state(TOI_ABORTED);
-+
-+ if (!toiNumAllocators) {
-+ if (!quiet)
-+ printk(KERN_INFO "TuxOnIce: No storage allocators have "
-+ "been registered. Hibernating will be "
-+ "disabled.\n");
-+ goto cleanup;
-+ }
-+
-+ list_for_each(Allocator, &toiAllocators) {
-+ thisAllocator = list_entry(Allocator, struct toi_module_ops,
-+ type_list);
-+
-+ /*
-+ * Not sure why you'd want to disable an allocator, but
-+ * we should honour the flag if we're providing it
-+ */
-+ if (!thisAllocator->enabled)
-+ continue;
-+
-+ result = thisAllocator->parse_sig_location(
-+ resume_file, (toiNumAllocators == 1),
-+ quiet);
-+
-+ switch (result) {
-+ case -EINVAL:
-+ /* For this allocator, but not a valid
-+ * configuration. Error already printed. */
-+ goto cleanup;
-+
-+ case 0:
-+ /* For this allocator and valid. */
-+ toiActiveAllocator = thisAllocator;
-+
-+ set_toi_state(TOI_RESUME_DEVICE_OK);
-+ set_toi_state(TOI_CAN_RESUME);
-+ returning = 1;
-+ goto cleanup;
-+ }
-+ }
-+ if (!quiet)
-+ printk(KERN_INFO "TuxOnIce: No matching enabled allocator "
-+ "found. Resuming disabled.\n");
-+cleanup:
-+ toi_deactivate_storage(0);
-+ return returning;
-+}
-+EXPORT_SYMBOL_GPL(toi_attempt_to_parse_resume_device);
-+
-+void attempt_to_parse_resume_device2(void)
-+{
-+ toi_prepare_usm();
-+ toi_attempt_to_parse_resume_device(0);
-+ toi_cleanup_usm();
-+}
-+EXPORT_SYMBOL_GPL(attempt_to_parse_resume_device2);
-+
-+void save_restore_alt_param(int replace, int quiet)
-+{
-+ static char resume_param_save[255];
-+ static unsigned long toi_state_save;
-+
-+ if (replace) {
-+ toi_state_save = toi_state;
-+ strcpy(resume_param_save, resume_file);
-+ strcpy(resume_file, alt_resume_param);
-+ } else {
-+ strcpy(resume_file, resume_param_save);
-+ toi_state = toi_state_save;
-+ }
-+ toi_attempt_to_parse_resume_device(quiet);
-+}
-+
-+void attempt_to_parse_alt_resume_param(void)
-+{
-+ int ok = 0;
-+
-+ /* Temporarily set resume_param to the poweroff value */
-+ if (!strlen(alt_resume_param))
-+ return;
-+
-+ printk(KERN_INFO "=== Trying Poweroff Resume2 ===\n");
-+ save_restore_alt_param(SAVE, NOQUIET);
-+ if (test_toi_state(TOI_CAN_RESUME))
-+ ok = 1;
-+
-+ printk(KERN_INFO "=== Done ===\n");
-+ save_restore_alt_param(RESTORE, QUIET);
-+
-+ /* If not ok, clear the string */
-+ if (ok)
-+ return;
-+
-+ printk(KERN_INFO "Can't resume from that location; clearing "
-+ "alt_resume_param.\n");
-+ alt_resume_param[0] = '\0';
-+}
-+
-+/**
-+ * noresume_reset_modules - reset data structures in case of non resuming
-+ *
-+ * When we read the start of an image, modules (and especially the
-+ * active allocator) might need to reset data structures if we
-+ * decide to remove the image rather than resuming from it.
-+ **/
-+static void noresume_reset_modules(void)
-+{
-+ struct toi_module_ops *this_filter;
-+
-+ list_for_each_entry(this_filter, &toi_filters, type_list)
-+ if (this_filter->noresume_reset)
-+ this_filter->noresume_reset();
-+
-+ if (toiActiveAllocator && toiActiveAllocator->noresume_reset)
-+ toiActiveAllocator->noresume_reset();
-+}
-+
-+/**
-+ * fill_toi_header - fill the hibernate header structure
-+ * @struct toi_header: Header data structure to be filled.
-+ **/
-+static int fill_toi_header(struct toi_header *sh)
-+{
-+ int i, error;
-+
-+ error = init_header((struct swsusp_info *) sh);
-+ if (error)
-+ return error;
-+
-+ sh->pagedir = pagedir1;
-+ sh->pageset_2_size = pagedir2.size;
-+ sh->param0 = toi_result;
-+ sh->param1 = toi_bkd.toi_action;
-+ sh->param2 = toi_bkd.toi_debug_state;
-+ sh->param3 = toi_bkd.toi_default_console_level;
-+ sh->root_fs = current->fs->root.mnt->mnt_sb->s_dev;
-+ for (i = 0; i < 4; i++)
-+ sh->io_time[i/2][i%2] = toi_bkd.toi_io_time[i/2][i%2];
-+ sh->bkd = boot_kernel_data_buffer;
-+ return 0;
-+}
-+
-+/**
-+ * rw_init_modules - initialize modules
-+ * @rw: Whether we are reading or writing an image.
-+ * @which: Section of the image being processed.
-+ *
-+ * Iterate over modules, preparing the ones that will be used to read or write
-+ * data.
-+ **/
-+static int rw_init_modules(int rw, int which)
-+{
-+ struct toi_module_ops *this_module;
-+ /* Initialise page transformers */
-+ list_for_each_entry(this_module, &toi_filters, type_list) {
-+ if (!this_module->enabled)
-+ continue;
-+ if (this_module->rw_init && this_module->rw_init(rw, which)) {
-+ abort_hibernate(TOI_FAILED_MODULE_INIT,
-+ "Failed to initialize the %s filter.",
-+ this_module->name);
-+ return 1;
-+ }
-+ }
-+
-+ /* Initialise allocator */
-+ if (toiActiveAllocator->rw_init(rw, which)) {
-+ abort_hibernate(TOI_FAILED_MODULE_INIT,
-+ "Failed to initialise the allocator.");
-+ return 1;
-+ }
-+
-+ /* Initialise other modules */
-+ list_for_each_entry(this_module, &toi_modules, module_list) {
-+ if (!this_module->enabled ||
-+ this_module->type == FILTER_MODULE ||
-+ this_module->type == WRITER_MODULE)
-+ continue;
-+ if (this_module->rw_init && this_module->rw_init(rw, which)) {
-+ set_abort_result(TOI_FAILED_MODULE_INIT);
-+ printk(KERN_INFO "Setting aborted flag due to module "
-+ "init failure.\n");
-+ return 1;
-+ }
-+ }
-+
-+ return 0;
-+}
-+
-+/**
-+ * rw_cleanup_modules - cleanup modules
-+ * @rw: Whether we are reading or writing an image.
-+ *
-+ * Cleanup components after reading or writing a set of pages.
-+ * Only the allocator may fail.
-+ **/
-+static int rw_cleanup_modules(int rw)
-+{
-+ struct toi_module_ops *this_module;
-+ int result = 0;
-+
-+ /* Cleanup other modules */
-+ list_for_each_entry(this_module, &toi_modules, module_list) {
-+ if (!this_module->enabled ||
-+ this_module->type == FILTER_MODULE ||
-+ this_module->type == WRITER_MODULE)
-+ continue;
-+ if (this_module->rw_cleanup)
-+ result |= this_module->rw_cleanup(rw);
-+ }
-+
-+ /* Flush data and cleanup */
-+ list_for_each_entry(this_module, &toi_filters, type_list) {
-+ if (!this_module->enabled)
-+ continue;
-+ if (this_module->rw_cleanup)
-+ result |= this_module->rw_cleanup(rw);
-+ }
-+
-+ result |= toiActiveAllocator->rw_cleanup(rw);
-+
-+ return result;
-+}
-+
-+static struct page *copy_page_from_orig_page(struct page *orig_page)
-+{
-+ int is_high = PageHighMem(orig_page), index, min, max;
-+ struct page *high_page = NULL,
-+ **my_last_high_page = &__get_cpu_var(last_high_page),
-+ **my_last_sought = &__get_cpu_var(last_sought);
-+ struct pbe *this, **my_last_low_page = &__get_cpu_var(last_low_page);
-+ void *compare;
-+
-+ if (is_high) {
-+ if (*my_last_sought && *my_last_high_page &&
-+ *my_last_sought < orig_page)
-+ high_page = *my_last_high_page;
-+ else
-+ high_page = (struct page *) restore_highmem_pblist;
-+ this = (struct pbe *) kmap(high_page);
-+ compare = orig_page;
-+ } else {
-+ if (*my_last_sought && *my_last_low_page &&
-+ *my_last_sought < orig_page)
-+ this = *my_last_low_page;
-+ else
-+ this = restore_pblist;
-+ compare = page_address(orig_page);
-+ }
-+
-+ *my_last_sought = orig_page;
-+
-+ /* Locate page containing pbe */
-+ while (this[PBES_PER_PAGE - 1].next &&
-+ this[PBES_PER_PAGE - 1].orig_address < compare) {
-+ if (is_high) {
-+ struct page *next_high_page = (struct page *)
-+ this[PBES_PER_PAGE - 1].next;
-+ kunmap(high_page);
-+ this = kmap(next_high_page);
-+ high_page = next_high_page;
-+ } else
-+ this = this[PBES_PER_PAGE - 1].next;
-+ }
-+
-+ /* Do a binary search within the page */
-+ min = 0;
-+ max = PBES_PER_PAGE;
-+ index = PBES_PER_PAGE / 2;
-+ while (max - min) {
-+ if (!this[index].orig_address ||
-+ this[index].orig_address > compare)
-+ max = index;
-+ else if (this[index].orig_address == compare) {
-+ if (is_high) {
-+ struct page *page = this[index].address;
-+ *my_last_high_page = high_page;
-+ kunmap(high_page);
-+ return page;
-+ }
-+ *my_last_low_page = this;
-+ return virt_to_page(this[index].address);
-+ } else
-+ min = index;
-+ index = ((max + min) / 2);
-+ };
-+
-+ if (is_high)
-+ kunmap(high_page);
-+
-+ abort_hibernate(TOI_FAILED_IO, "Failed to get destination page for"
-+ " orig page %p. This[min].orig_address=%p.\n", orig_page,
-+ this[index].orig_address);
-+ return NULL;
-+}
-+
-+/**
-+ * write_next_page - write the next page in a pageset
-+ * @data_pfn: The pfn where the next data to write is located.
-+ * @my_io_index: The index of the page in the pageset.
-+ * @write_pfn: The pfn number to write in the image (where the data belongs).
-+ * @first_filter: Where to send the page (optimisation).
-+ *
-+ * Get the pfn of the next page to write, map the page if necessary and do the
-+ * write.
-+ **/
-+static int write_next_page(unsigned long *data_pfn, int *my_io_index,
-+ unsigned long *write_pfn, struct toi_module_ops *first_filter)
-+{
-+ struct page *page;
-+ char **my_checksum_locn = &__get_cpu_var(checksum_locn);
-+ int result = 0, was_present;
-+
-+ *data_pfn = memory_bm_next_pfn(io_map);
-+
-+ /* Another thread could have beaten us to it. */
-+ if (*data_pfn == BM_END_OF_MAP) {
-+ if (atomic_read(&io_count)) {
-+ printk(KERN_INFO "Ran out of pfns but io_count is "
-+ "still %d.\n", atomic_read(&io_count));
-+ BUG();
-+ }
-+ mutex_unlock(&io_mutex);
-+ return -ENODATA;
-+ }
-+
-+ *my_io_index = io_finish_at - atomic_sub_return(1, &io_count);
-+
-+ memory_bm_clear_bit(io_map, *data_pfn);
-+ page = pfn_to_page(*data_pfn);
-+
-+ was_present = kernel_page_present(page);
-+ if (!was_present)
-+ kernel_map_pages(page, 1, 1);
-+
-+ if (io_pageset == 1)
-+ *write_pfn = memory_bm_next_pfn(pageset1_map);
-+ else {
-+ *write_pfn = *data_pfn;
-+ *my_checksum_locn = tuxonice_get_next_checksum();
-+ }
-+
-+ mutex_unlock(&io_mutex);
-+
-+ if (io_pageset == 2 && tuxonice_calc_checksum(page, *my_checksum_locn))
-+ return 1;
-+
-+ result = first_filter->write_page(*write_pfn, page, PAGE_SIZE);
-+
-+ if (!was_present)
-+ kernel_map_pages(page, 1, 0);
-+
-+ return result;
-+}
-+
-+/**
-+ * read_next_page - read the next page in a pageset
-+ * @my_io_index: The index of the page in the pageset.
-+ * @write_pfn: The pfn in which the data belongs.
-+ *
-+ * Read a page of the image into our buffer. It can happen (here and in the
-+ * write routine) that threads don't get run until after other CPUs have done
-+ * all the work. This was the cause of the long standing issue with
-+ * occasionally getting -ENODATA errors at the end of reading the image. We
-+ * therefore need to check there's actually a page to read before trying to
-+ * retrieve one.
-+ **/
-+
-+static int read_next_page(int *my_io_index, unsigned long *write_pfn,
-+ struct page *buffer, struct toi_module_ops *first_filter)
-+{
-+ unsigned int buf_size = PAGE_SIZE;
-+ unsigned long left = atomic_read(&io_count);
-+
-+ if (left)
-+ *my_io_index = io_finish_at - atomic_sub_return(1, &io_count);
-+
-+ mutex_unlock(&io_mutex);
-+
-+ /*
-+ * Are we aborting? If so, don't submit any more I/O as
-+ * resetting the resume_attempted flag (from ui.c) will
-+ * clear the bdev flags, making this thread oops.
-+ */
-+ if (unlikely(test_toi_state(TOI_STOP_RESUME))) {
-+ atomic_dec(&toi_io_workers);
-+ if (!atomic_read(&toi_io_workers)) {
-+ /*
-+ * So we can be sure we'll have memory for
-+ * marking that we haven't resumed.
-+ */
-+ rw_cleanup_modules(READ);
-+ set_toi_state(TOI_IO_STOPPED);
-+ }
-+ while (1)
-+ schedule();
-+ }
-+
-+ if (!left)
-+ return -ENODATA;
-+
-+ /*
-+ * See toi_bio_read_page in tuxonice_bio.c:
-+ * read the next page in the image.
-+ */
-+ return first_filter->read_page(write_pfn, buffer, &buf_size);
-+}
-+
-+static void use_read_page(unsigned long write_pfn, struct page *buffer)
-+{
-+ struct page *final_page = pfn_to_page(write_pfn),
-+ *copy_page = final_page;
-+ char *virt, *buffer_virt;
-+
-+ if (io_pageset == 1 && !PagePageset1Copy(final_page)) {
-+ copy_page = copy_page_from_orig_page(final_page);
-+ BUG_ON(!copy_page);
-+ }
-+
-+ if (memory_bm_test_bit(io_map, write_pfn)) {
-+ int was_present;
-+
-+ virt = kmap(copy_page);
-+ buffer_virt = kmap(buffer);
-+ was_present = kernel_page_present(copy_page);
-+ if (!was_present)
-+ kernel_map_pages(copy_page, 1, 1);
-+ memcpy(virt, buffer_virt, PAGE_SIZE);
-+ if (!was_present)
-+ kernel_map_pages(copy_page, 1, 0);
-+ kunmap(copy_page);
-+ kunmap(buffer);
-+ memory_bm_clear_bit(io_map, write_pfn);
-+ } else {
-+ mutex_lock(&io_mutex);
-+ atomic_inc(&io_count);
-+ mutex_unlock(&io_mutex);
-+ }
-+}
-+
-+static unsigned long status_update(int writing, unsigned long done,
-+ unsigned long ticks)
-+{
-+ int cs_index = writing ? 0 : 1;
-+ unsigned long ticks_so_far = toi_bkd.toi_io_time[cs_index][1] + ticks;
-+ unsigned long msec = jiffies_to_msecs(abs(ticks_so_far));
-+ unsigned long pgs_per_s, estimate = 0, pages_left;
-+
-+ if (msec) {
-+ pages_left = io_barmax - done;
-+ pgs_per_s = 1000 * done / msec;
-+ if (pgs_per_s)
-+ estimate = pages_left / pgs_per_s;
-+ }
-+
-+ if (estimate && ticks > HZ / 2)
-+ return toi_update_status(done, io_barmax,
-+ " %d/%d MB (%lu sec left)",
-+ MB(done+1), MB(io_barmax), estimate);
-+
-+ return toi_update_status(done, io_barmax, " %d/%d MB",
-+ MB(done+1), MB(io_barmax));
-+}
-+
-+/**
-+ * worker_rw_loop - main loop to read/write pages
-+ *
-+ * The main I/O loop for reading or writing pages. The io_map bitmap is used to
-+ * track the pages to read/write.
-+ * If we are reading, the pages are loaded to their final (mapped) pfn.
-+ **/
-+static int worker_rw_loop(void *data)
-+{
-+ unsigned long data_pfn, write_pfn, next_jiffies = jiffies + HZ / 4,
-+ jif_index = 1, start_time = jiffies;
-+ int result = 0, my_io_index = 0, last_worker;
-+ struct toi_module_ops *first_filter = toi_get_next_filter(NULL);
-+ struct page *buffer = toi_alloc_page(28, TOI_ATOMIC_GFP);
-+
-+ current->flags |= PF_NOFREEZE;
-+
-+ mutex_lock(&io_mutex);
-+
-+ do {
-+ if (data && jiffies > next_jiffies) {
-+ next_jiffies += HZ / 4;
-+ if (toiActiveAllocator->update_throughput_throttle)
-+ toiActiveAllocator->update_throughput_throttle(
-+ jif_index);
-+ jif_index++;
-+ }
-+
-+ /*
-+ * What page to use? If reading, don't know yet which page's
-+ * data will be read, so always use the buffer. If writing,
-+ * use the copy (Pageset1) or original page (Pageset2), but
-+ * always write the pfn of the original page.
-+ */
-+ if (io_write)
-+ result = write_next_page(&data_pfn, &my_io_index,
-+ &write_pfn, first_filter);
-+ else /* Reading */
-+ result = read_next_page(&my_io_index, &write_pfn,
-+ buffer, first_filter);
-+
-+ if (result) {
-+ mutex_lock(&io_mutex);
-+ /* Nothing to do?
*/ -+ if (result == -ENODATA) -+ break; -+ -+ io_result = result; -+ -+ if (io_write) { -+ printk(KERN_INFO "Write chunk returned %d.\n", -+ result); -+ abort_hibernate(TOI_FAILED_IO, -+ "Failed to write a chunk of the " -+ "image."); -+ break; -+ } -+ -+ if (io_pageset == 1) { -+ printk(KERN_ERR "\nBreaking out of I/O loop " -+ "because of result code %d.\n", result); -+ break; -+ } -+ panic("Read chunk returned (%d)", result); -+ } -+ -+ /* -+ * Discard reads of resaved pages while reading ps2 -+ * and unwanted pages while rereading ps2 when aborting. -+ */ -+ if (!io_write && !PageResave(pfn_to_page(write_pfn))) -+ use_read_page(write_pfn, buffer); -+ -+ if (my_io_index + io_base == io_nextupdate) -+ io_nextupdate = status_update(io_write, my_io_index + -+ io_base, jiffies - start_time); -+ -+ if (my_io_index == io_pc) { -+ printk(KERN_CONT "...%d%%", 20 * io_pc_step); -+ io_pc_step++; -+ io_pc = io_finish_at * io_pc_step / 5; -+ } -+ -+ toi_cond_pause(0, NULL); -+ -+ /* -+ * Subtle: If there's less I/O still to be done than threads -+ * running, quit. This stops us doing I/O beyond the end of -+ * the image when reading. -+ * -+ * Possible race condition. Two threads could do the test at -+ * the same time; one should exit and one should continue. -+ * Therefore we take the mutex before comparing and exiting. -+ */ -+ -+ mutex_lock(&io_mutex); -+ -+ } while (atomic_read(&io_count) >= atomic_read(&toi_io_workers) && -+ !(io_write && test_result_state(TOI_ABORTED))); -+ -+ last_worker = atomic_dec_and_test(&toi_io_workers); -+ mutex_unlock(&io_mutex); -+ -+ if (last_worker) { -+ toi_bio_queue_flusher_should_finish = 1; -+ wake_up(&toi_io_queue_flusher); -+ result = toiActiveAllocator->finish_all_io(); -+ printk(KERN_CONT "\n"); -+ } -+ -+ toi__free_page(28, buffer); -+ -+ return result; -+} -+ -+static int start_other_threads(void) -+{ -+ int cpu, num_started = 0; -+ struct task_struct *p; -+ int to_start = (toi_max_workers ? 
toi_max_workers : num_online_cpus()) - 1; -+ -+ atomic_set(&toi_io_workers, to_start); -+ -+ for_each_online_cpu(cpu) { -+ if (num_started == to_start) -+ break; -+ -+ if (cpu == smp_processor_id()) -+ continue; -+ -+ p = kthread_create(worker_rw_loop, num_started ? NULL : MONITOR, -+ "ktoi_io/%d", cpu); -+ if (IS_ERR(p)) { -+ printk(KERN_ERR "ktoi_io for %i failed\n", cpu); -+ atomic_dec(&toi_io_workers); -+ continue; -+ } -+ kthread_bind(p, cpu); -+ p->flags |= PF_MEMALLOC; -+ wake_up_process(p); -+ num_started++; -+ } -+ -+ return num_started; -+} -+ -+/** -+ * do_rw_loop - main highlevel function for reading or writing pages -+ * -+ * Create the io_map bitmap and call worker_rw_loop to perform I/O operations. -+ **/ -+static int do_rw_loop(int write, int finish_at, struct memory_bitmap *pageflags, -+ int base, int barmax, int pageset) -+{ -+ int index = 0, cpu, num_other_threads = 0, result = 0; -+ unsigned long pfn; -+ -+ if (!finish_at) -+ return 0; -+ -+ io_write = write; -+ io_finish_at = finish_at; -+ io_base = base; -+ io_barmax = barmax; -+ io_pageset = pageset; -+ io_index = 0; -+ io_pc = io_finish_at / 5; -+ io_pc_step = 1; -+ io_result = 0; -+ io_nextupdate = base + 1; -+ toi_bio_queue_flusher_should_finish = 0; -+ -+ for_each_online_cpu(cpu) { -+ per_cpu(last_sought, cpu) = NULL; -+ per_cpu(last_low_page, cpu) = NULL; -+ per_cpu(last_high_page, cpu) = NULL; -+ } -+ -+ /* Ensure all bits clear */ -+ memory_bm_clear(io_map); -+ -+ /* Set the bits for the pages to write */ -+ memory_bm_position_reset(pageflags); -+ -+ pfn = memory_bm_next_pfn(pageflags); -+ -+ while (pfn != BM_END_OF_MAP && index < finish_at) { -+ memory_bm_set_bit(io_map, pfn); -+ pfn = memory_bm_next_pfn(pageflags); -+ index++; -+ } -+ -+ BUG_ON(index < finish_at); -+ -+ atomic_set(&io_count, finish_at); -+ -+ memory_bm_position_reset(pageset1_map); -+ -+ clear_toi_state(TOI_IO_STOPPED); -+ memory_bm_position_reset(io_map); -+ -+ if (!test_action_state(TOI_NO_MULTITHREADED_IO) && -+ 
(write || !toi_force_no_multithreaded)) -+ num_other_threads = start_other_threads(); -+ -+ if (!num_other_threads || !toiActiveAllocator->io_flusher || -+ test_action_state(TOI_NO_FLUSHER_THREAD)) { -+ atomic_inc(&toi_io_workers); -+ worker_rw_loop(num_other_threads ? NULL : MONITOR); -+ } else -+ result = toiActiveAllocator->io_flusher(write); -+ -+ while (atomic_read(&toi_io_workers)) -+ schedule(); -+ -+ if (unlikely(test_toi_state(TOI_STOP_RESUME))) { -+ if (!atomic_read(&toi_io_workers)) { -+ rw_cleanup_modules(READ); -+ set_toi_state(TOI_IO_STOPPED); -+ } -+ while (1) -+ schedule(); -+ } -+ set_toi_state(TOI_IO_STOPPED); -+ -+ if (!io_result && !result && !test_result_state(TOI_ABORTED)) { -+ unsigned long next; -+ -+ toi_update_status(io_base + io_finish_at, io_barmax, -+ " %d/%d MB ", -+ MB(io_base + io_finish_at), MB(io_barmax)); -+ -+ memory_bm_position_reset(io_map); -+ next = memory_bm_next_pfn(io_map); -+ if (next != BM_END_OF_MAP) { -+ printk(KERN_INFO "Finished I/O loop but still work to " -+ "do?\nFinish at = %d. io_count = %d.\n", -+ finish_at, atomic_read(&io_count)); -+ printk(KERN_INFO "I/O bitmap still records work to do." -+ "%ld.\n", next); -+ do { -+ cpu_relax(); -+ } while (0); -+ } -+ } -+ -+ return io_result ? io_result : result; -+} -+ -+/** -+ * write_pageset - write a pageset to disk. -+ * @pagedir: Which pagedir to write. -+ * -+ * Returns: -+ * Zero on success or -1 on failure. -+ **/ -+int write_pageset(struct pagedir *pagedir) -+{ -+ int finish_at, base = 0; -+ int barmax = pagedir1.size + pagedir2.size; -+ long error = 0; -+ struct memory_bitmap *pageflags; -+ unsigned long start_time, end_time; -+ -+ /* -+ * Even if there is nothing to read or write, the allocator -+ * may need the init/cleanup for it's housekeeping. (eg: -+ * Pageset1 may start where pageset2 ends when writing). 
-+ */ -+ finish_at = pagedir->size; -+ -+ if (pagedir->id == 1) { -+ toi_prepare_status(DONT_CLEAR_BAR, -+ "Writing kernel & process data..."); -+ base = pagedir2.size; -+ if (test_action_state(TOI_TEST_FILTER_SPEED) || -+ test_action_state(TOI_TEST_BIO)) -+ pageflags = pageset1_map; -+ else -+ pageflags = pageset1_copy_map; -+ } else { -+ toi_prepare_status(DONT_CLEAR_BAR, "Writing caches..."); -+ pageflags = pageset2_map; -+ } -+ -+ start_time = jiffies; -+ -+ if (rw_init_modules(1, pagedir->id)) { -+ abort_hibernate(TOI_FAILED_MODULE_INIT, -+ "Failed to initialise modules for writing."); -+ error = 1; -+ } -+ -+ if (!error) -+ error = do_rw_loop(1, finish_at, pageflags, base, barmax, -+ pagedir->id); -+ -+ if (rw_cleanup_modules(WRITE) && !error) { -+ abort_hibernate(TOI_FAILED_MODULE_CLEANUP, -+ "Failed to cleanup after writing."); -+ error = 1; -+ } -+ -+ end_time = jiffies; -+ -+ if ((end_time - start_time) && (!test_result_state(TOI_ABORTED))) { -+ toi_bkd.toi_io_time[0][0] += finish_at, -+ toi_bkd.toi_io_time[0][1] += (end_time - start_time); -+ } -+ -+ return error; -+} -+ -+/** -+ * read_pageset - highlevel function to read a pageset from disk -+ * @pagedir: pageset to read -+ * @overwrittenpagesonly: Whether to read the whole pageset or -+ * only part of it. -+ * -+ * Returns: -+ * Zero on success or -1 on failure. 
-+ **/ -+static int read_pageset(struct pagedir *pagedir, int overwrittenpagesonly) -+{ -+ int result = 0, base = 0; -+ int finish_at = pagedir->size; -+ int barmax = pagedir1.size + pagedir2.size; -+ struct memory_bitmap *pageflags; -+ unsigned long start_time, end_time; -+ -+ if (pagedir->id == 1) { -+ toi_prepare_status(DONT_CLEAR_BAR, -+ "Reading kernel & process data..."); -+ pageflags = pageset1_map; -+ } else { -+ toi_prepare_status(DONT_CLEAR_BAR, "Reading caches..."); -+ if (overwrittenpagesonly) { -+ barmax = min(pagedir1.size, pagedir2.size); -+ finish_at = min(pagedir1.size, pagedir2.size); -+ } else -+ base = pagedir1.size; -+ pageflags = pageset2_map; -+ } -+ -+ start_time = jiffies; -+ -+ if (rw_init_modules(0, pagedir->id)) { -+ toiActiveAllocator->remove_image(); -+ result = 1; -+ } else -+ result = do_rw_loop(0, finish_at, pageflags, base, barmax, -+ pagedir->id); -+ -+ if (rw_cleanup_modules(READ) && !result) { -+ abort_hibernate(TOI_FAILED_MODULE_CLEANUP, -+ "Failed to cleanup after reading."); -+ result = 1; -+ } -+ -+ /* Statistics */ -+ end_time = jiffies; -+ -+ if ((end_time - start_time) && (!test_result_state(TOI_ABORTED))) { -+ toi_bkd.toi_io_time[1][0] += finish_at, -+ toi_bkd.toi_io_time[1][1] += (end_time - start_time); -+ } -+ -+ return result; -+} -+ -+/** -+ * write_module_configs - store the modules configuration -+ * -+ * The configuration for each module is stored in the image header. -+ * Returns: Int -+ * Zero on success, Error value otherwise. -+ **/ -+static int write_module_configs(void) -+{ -+ struct toi_module_ops *this_module; -+ char *buffer = (char *) toi_get_zeroed_page(22, TOI_ATOMIC_GFP); -+ int len, index = 1; -+ struct toi_module_header toi_module_header; -+ -+ if (!buffer) { -+ printk(KERN_INFO "Failed to allocate a buffer for saving " -+ "module configuration info.\n"); -+ return -ENOMEM; -+ } -+ -+ /* -+ * We have to know which data goes with which module, so we at -+ * least write a length of zero for a module. 
Note that we are -+ * also assuming every module's config data takes <= PAGE_SIZE. -+ */ -+ -+ /* For each module (in registration order) */ -+ list_for_each_entry(this_module, &toi_modules, module_list) { -+ if (!this_module->enabled || !this_module->storage_needed || -+ (this_module->type == WRITER_MODULE && -+ toiActiveAllocator != this_module)) -+ continue; -+ -+ /* Get the data from the module */ -+ len = 0; -+ if (this_module->save_config_info) -+ len = this_module->save_config_info(buffer); -+ -+ /* Save the details of the module */ -+ toi_module_header.enabled = this_module->enabled; -+ toi_module_header.type = this_module->type; -+ toi_module_header.index = index++; -+ strncpy(toi_module_header.name, this_module->name, -+ sizeof(toi_module_header.name)); -+ toiActiveAllocator->rw_header_chunk(WRITE, -+ this_module, -+ (char *) &toi_module_header, -+ sizeof(toi_module_header)); -+ -+ /* Save the size of the data and any data returned */ -+ toiActiveAllocator->rw_header_chunk(WRITE, -+ this_module, -+ (char *) &len, sizeof(int)); -+ if (len) -+ toiActiveAllocator->rw_header_chunk( -+ WRITE, this_module, buffer, len); -+ } -+ -+ /* Write a blank header to terminate the list */ -+ toi_module_header.name[0] = '\0'; -+ toiActiveAllocator->rw_header_chunk(WRITE, NULL, -+ (char *) &toi_module_header, sizeof(toi_module_header)); -+ -+ toi_free_page(22, (unsigned long) buffer); -+ return 0; -+} -+ -+/** -+ * read_one_module_config - read and configure one module -+ * -+ * Read the configuration for one module, and configure the module -+ * to match if it is loaded. -+ * -+ * Returns: Int -+ * Zero on success, Error value otherwise. 
-+ **/ -+static int read_one_module_config(struct toi_module_header *header) -+{ -+ struct toi_module_ops *this_module; -+ int result, len; -+ char *buffer; -+ -+ /* Find the module */ -+ this_module = toi_find_module_given_name(header->name); -+ -+ if (!this_module) { -+ if (header->enabled) { -+ toi_early_boot_message(1, TOI_CONTINUE_REQ, -+ "It looks like we need module %s for reading " -+ "the image but it hasn't been registered.\n", -+ header->name); -+ if (!(test_toi_state(TOI_CONTINUE_REQ))) -+ return -EINVAL; -+ } else -+ printk(KERN_INFO "Module %s configuration data found, " -+ "but the module hasn't registered. Looks like " -+ "it was disabled, so we're ignoring its data.", -+ header->name); -+ } -+ -+ /* Get the length of the data (if any) */ -+ result = toiActiveAllocator->rw_header_chunk(READ, NULL, (char *) &len, -+ sizeof(int)); -+ if (result) { -+ printk(KERN_ERR "Failed to read the length of the module %s's" -+ " configuration data.\n", -+ header->name); -+ return -EINVAL; -+ } -+ -+ /* Read any data and pass to the module (if we found one) */ -+ if (!len) -+ return 0; -+ -+ buffer = (char *) toi_get_zeroed_page(23, TOI_ATOMIC_GFP); -+ -+ if (!buffer) { -+ printk(KERN_ERR "Failed to allocate a buffer for reloading " -+ "module configuration info.\n"); -+ return -ENOMEM; -+ } -+ -+ toiActiveAllocator->rw_header_chunk(READ, NULL, buffer, len); -+ -+ if (!this_module) -+ goto out; -+ -+ if (!this_module->save_config_info) -+ printk(KERN_ERR "Huh? Module %s appears to have a " -+ "save_config_info, but not a load_config_info " -+ "function!\n", this_module->name); -+ else -+ this_module->load_config_info(buffer, len); -+ -+ /* -+ * Now move this module to the tail of its lists. This will put it in -+ * order. Any new modules will end up at the top of the lists. They -+ * should have been set to disabled when loaded (people will -+ * normally not edit an initrd to load a new module and then hibernate -+ * without using it!). 
-+ */ -+ -+ toi_move_module_tail(this_module); -+ -+ this_module->enabled = header->enabled; -+ -+out: -+ toi_free_page(23, (unsigned long) buffer); -+ return 0; -+} -+ -+/** -+ * read_module_configs - reload module configurations from the image header. -+ * -+ * Returns: Int -+ * Zero on success or an error code. -+ **/ -+static int read_module_configs(void) -+{ -+ int result = 0; -+ struct toi_module_header toi_module_header; -+ struct toi_module_ops *this_module; -+ -+ /* All modules are initially disabled. That way, if we have a module -+ * loaded now that wasn't loaded when we hibernated, it won't be used -+ * in trying to read the data. -+ */ -+ list_for_each_entry(this_module, &toi_modules, module_list) -+ this_module->enabled = 0; -+ -+ /* Get the first module header */ -+ result = toiActiveAllocator->rw_header_chunk(READ, NULL, -+ (char *) &toi_module_header, -+ sizeof(toi_module_header)); -+ if (result) { -+ printk(KERN_ERR "Failed to read the next module header.\n"); -+ return -EINVAL; -+ } -+ -+ /* For each module (in registration order) */ -+ while (toi_module_header.name[0]) { -+ result = read_one_module_config(&toi_module_header); -+ -+ if (result) -+ return -EINVAL; -+ -+ /* Get the next module header */ -+ result = toiActiveAllocator->rw_header_chunk(READ, NULL, -+ (char *) &toi_module_header, -+ sizeof(toi_module_header)); -+ -+ if (result) { -+ printk(KERN_ERR "Failed to read the next module " -+ "header.\n"); -+ return -EINVAL; -+ } -+ } -+ -+ return 0; -+} -+ -+static inline int save_fs_info(struct fs_info *fs, struct block_device *bdev) -+{ -+ return (!fs || IS_ERR(fs) || !fs->last_mount_size) ? 
0 : 1; -+} -+ -+int fs_info_space_needed(void) -+{ -+ const struct super_block *sb; -+ int result = sizeof(int); -+ -+ list_for_each_entry(sb, &super_blocks, s_list) { -+ struct fs_info *fs; -+ -+ if (!sb->s_bdev) -+ continue; -+ -+ fs = fs_info_from_block_dev(sb->s_bdev); -+ if (save_fs_info(fs, sb->s_bdev)) -+ result += 16 + sizeof(int) + fs->last_mount_size; -+ free_fs_info(fs); -+ } -+ return result; -+} -+ -+static int fs_info_num_to_save(void) -+{ -+ const struct super_block *sb; -+ int to_save = 0; -+ -+ list_for_each_entry(sb, &super_blocks, s_list) { -+ struct fs_info *fs; -+ -+ if (!sb->s_bdev) -+ continue; -+ -+ fs = fs_info_from_block_dev(sb->s_bdev); -+ if (save_fs_info(fs, sb->s_bdev)) -+ to_save++; -+ free_fs_info(fs); -+ } -+ -+ return to_save; -+} -+ -+static int fs_info_save(void) -+{ -+ const struct super_block *sb; -+ int to_save = fs_info_num_to_save(); -+ -+ if (toiActiveAllocator->rw_header_chunk(WRITE, NULL, (char *) &to_save, -+ sizeof(int))) { -+ abort_hibernate(TOI_FAILED_IO, "Failed to write num fs_info" -+ " to save."); -+ return -EIO; -+ } -+ -+ list_for_each_entry(sb, &super_blocks, s_list) { -+ struct fs_info *fs; -+ -+ if (!sb->s_bdev) -+ continue; -+ -+ fs = fs_info_from_block_dev(sb->s_bdev); -+ if (save_fs_info(fs, sb->s_bdev)) { -+ if (toiActiveAllocator->rw_header_chunk(WRITE, NULL, -+ &fs->uuid[0], 16)) { -+ abort_hibernate(TOI_FAILED_IO, "Failed to " -+ "write uuid."); -+ return -EIO; -+ } -+ if (toiActiveAllocator->rw_header_chunk(WRITE, NULL, -+ (char *) &fs->last_mount_size, sizeof(int))) { -+ abort_hibernate(TOI_FAILED_IO, "Failed to " -+ "write last mount length."); -+ return -EIO; -+ } -+ if (toiActiveAllocator->rw_header_chunk(WRITE, NULL, -+ fs->last_mount, fs->last_mount_size)) { -+ abort_hibernate(TOI_FAILED_IO, "Failed to " -+ "write uuid."); -+ return -EIO; -+ } -+ } -+ free_fs_info(fs); -+ } -+ return 0; -+} -+ -+static int fs_info_load_and_check_one(void) -+{ -+ char uuid[16], *last_mount; -+ int result = 0, ln; 
-+ dev_t dev_t; -+ struct block_device *dev; -+ struct fs_info *fs_info; -+ -+ if (toiActiveAllocator->rw_header_chunk(READ, NULL, uuid, 16)) { -+ abort_hibernate(TOI_FAILED_IO, "Failed to read uuid."); -+ return -EIO; -+ } -+ -+ if (toiActiveAllocator->rw_header_chunk(READ, NULL, (char *) &ln, -+ sizeof(int))) { -+ abort_hibernate(TOI_FAILED_IO, -+ "Failed to read last mount size."); -+ return -EIO; -+ } -+ -+ last_mount = kzalloc(ln, GFP_KERNEL); -+ -+ if (!last_mount) -+ return -ENOMEM; -+ -+ if (toiActiveAllocator->rw_header_chunk(READ, NULL, last_mount, ln)) { -+ abort_hibernate(TOI_FAILED_IO, -+ "Failed to read last mount timestamp."); -+ result = -EIO; -+ goto out_lmt; -+ } -+ -+ dev_t = blk_lookup_uuid(uuid); -+ if (!dev_t) -+ goto out_lmt; -+ -+ dev = toi_open_by_devnum(dev_t); -+ -+ fs_info = fs_info_from_block_dev(dev); -+ if (fs_info && !IS_ERR(fs_info)) { -+ if (ln != fs_info->last_mount_size) { -+ printk(KERN_EMERG "Found matching uuid but last mount " -+ "time lengths differ?! " -+ "(%d vs %d).\n", ln, -+ fs_info->last_mount_size); -+ result = -EINVAL; -+ } else { -+ char buf[BDEVNAME_SIZE]; -+ result = !!memcmp(fs_info->last_mount, last_mount, ln); -+ if (result) -+ printk(KERN_EMERG "Last mount time for %s has " -+ "changed!\n", bdevname(dev, buf)); -+ } -+ } -+ toi_close_bdev(dev); -+ free_fs_info(fs_info); -+out_lmt: -+ kfree(last_mount); -+ return result; -+} -+ -+static int fs_info_load_and_check(void) -+{ -+ int to_do, result; -+ -+ if (toiActiveAllocator->rw_header_chunk(READ, NULL, (char *) &to_do, -+ sizeof(int))) { -+ abort_hibernate(TOI_FAILED_IO, "Failed to read num fs_info " -+ "to load."); -+ return -EIO; -+ } -+ -+ while(to_do--) -+ result |= fs_info_load_and_check_one(); -+ -+ return result; -+} -+ -+/** -+ * write_image_header - write the image header after write the image proper -+ * -+ * Returns: Int -+ * Zero on success, error value otherwise. 
-+ **/ -+int write_image_header(void) -+{ -+ int ret; -+ int total = pagedir1.size + pagedir2.size+2; -+ char *header_buffer = NULL; -+ -+ /* Now prepare to write the header */ -+ ret = toiActiveAllocator->write_header_init(); -+ if (ret) { -+ abort_hibernate(TOI_FAILED_MODULE_INIT, -+ "Active allocator's write_header_init" -+ " function failed."); -+ goto write_image_header_abort; -+ } -+ -+ /* Get a buffer */ -+ header_buffer = (char *) toi_get_zeroed_page(24, TOI_ATOMIC_GFP); -+ if (!header_buffer) { -+ abort_hibernate(TOI_OUT_OF_MEMORY, -+ "Out of memory when trying to get page for header!"); -+ goto write_image_header_abort; -+ } -+ -+ /* Write hibernate header */ -+ if (fill_toi_header((struct toi_header *) header_buffer)) { -+ abort_hibernate(TOI_OUT_OF_MEMORY, -+ "Failure to fill header information!"); -+ goto write_image_header_abort; -+ } -+ -+ if (toiActiveAllocator->rw_header_chunk(WRITE, NULL, -+ header_buffer, sizeof(struct toi_header))) { -+ abort_hibernate(TOI_OUT_OF_MEMORY, -+ "Failure to write header info."); -+ goto write_image_header_abort; -+ } -+ -+ if (toiActiveAllocator->rw_header_chunk(WRITE, NULL, -+ (char *) &toi_max_workers, sizeof(toi_max_workers))) { -+ abort_hibernate(TOI_OUT_OF_MEMORY, -+ "Failure to number of workers to use."); -+ goto write_image_header_abort; -+ } -+ -+ /* Write filesystem info */ -+ if (fs_info_save()) -+ goto write_image_header_abort; -+ -+ /* Write module configurations */ -+ ret = write_module_configs(); -+ if (ret) { -+ abort_hibernate(TOI_FAILED_IO, -+ "Failed to write module configs."); -+ goto write_image_header_abort; -+ } -+ -+ if (memory_bm_write(pageset1_map, -+ toiActiveAllocator->rw_header_chunk)) { -+ abort_hibernate(TOI_FAILED_IO, -+ "Failed to write bitmaps."); -+ goto write_image_header_abort; -+ } -+ -+ /* Flush data and let allocator cleanup */ -+ if (toiActiveAllocator->write_header_cleanup()) { -+ abort_hibernate(TOI_FAILED_IO, -+ "Failed to cleanup writing header."); -+ goto 
write_image_header_abort_no_cleanup; -+ } -+ -+ if (test_result_state(TOI_ABORTED)) -+ goto write_image_header_abort_no_cleanup; -+ -+ toi_update_status(total, total, NULL); -+ -+out: -+ if (header_buffer) -+ toi_free_page(24, (unsigned long) header_buffer); -+ return ret; -+ -+write_image_header_abort: -+ toiActiveAllocator->write_header_cleanup(); -+write_image_header_abort_no_cleanup: -+ ret = -1; -+ goto out; -+} -+ -+/** -+ * sanity_check - check the header -+ * @sh: the header which was saved at hibernate time. -+ * -+ * Perform a few checks, seeking to ensure that the kernel being -+ * booted matches the one hibernated. They need to match so we can -+ * be _sure_ things will work. It is not absolutely impossible for -+ * resuming from a different kernel to work, just not assured. -+ **/ -+static char *sanity_check(struct toi_header *sh) -+{ -+ char *reason = check_image_kernel((struct swsusp_info *) sh); -+ -+ if (reason) -+ return reason; -+ -+ if (!test_action_state(TOI_IGNORE_ROOTFS)) { -+ const struct super_block *sb; -+ list_for_each_entry(sb, &super_blocks, s_list) { -+ if ((!(sb->s_flags & MS_RDONLY)) && -+ (sb->s_type->fs_flags & FS_REQUIRES_DEV)) -+ return "Device backed fs has been mounted " -+ "rw prior to resume or initrd/ramfs " -+ "is mounted rw."; -+ } -+ } -+ -+ return NULL; -+} -+ -+static DECLARE_WAIT_QUEUE_HEAD(freeze_wait); -+ -+#define FREEZE_IN_PROGRESS (~0) -+ -+static int freeze_result; -+ -+static void do_freeze(struct work_struct *dummy) -+{ -+ freeze_result = freeze_processes(); -+ wake_up(&freeze_wait); -+ trap_non_toi_io = 1; -+} -+ -+static DECLARE_WORK(freeze_work, do_freeze); -+ -+/** -+ * __read_pageset1 - test for the existence of an image and attempt to load it -+ * -+ * Returns: Int -+ * Zero if image found and pageset1 successfully loaded. -+ * Error if no image found or loaded. 
-+ **/ -+static int __read_pageset1(void) -+{ -+ int i, result = 0; -+ char *header_buffer = (char *) toi_get_zeroed_page(25, TOI_ATOMIC_GFP), -+ *sanity_error = NULL; -+ struct toi_header *toi_header; -+ -+ if (!header_buffer) { -+ printk(KERN_INFO "Unable to allocate a page for reading the " -+ "signature.\n"); -+ return -ENOMEM; -+ } -+ -+ /* Check for an image */ -+ result = toiActiveAllocator->image_exists(1); -+ if (result == 3) { -+ result = -ENODATA; -+ toi_early_boot_message(1, 0, "The signature from an older " -+ "version of TuxOnIce has been detected."); -+ goto out_remove_image; -+ } -+ -+ if (result != 1) { -+ result = -ENODATA; -+ noresume_reset_modules(); -+ printk(KERN_INFO "TuxOnIce: No image found.\n"); -+ goto out; -+ } -+ -+ /* -+ * Prepare the active allocator for reading the image header. The -+ * activate allocator might read its own configuration. -+ * -+ * NB: This call may never return because there might be a signature -+ * for a different image such that we warn the user and they choose -+ * to reboot. (If the device ids look erroneous (2.4 vs 2.6) or the -+ * location of the image might be unavailable if it was stored on a -+ * network connection). -+ */ -+ -+ result = toiActiveAllocator->read_header_init(); -+ if (result) { -+ printk(KERN_INFO "TuxOnIce: Failed to initialise, reading the " -+ "image header.\n"); -+ goto out_remove_image; -+ } -+ -+ /* Check for noresume command line option */ -+ if (test_toi_state(TOI_NORESUME_SPECIFIED)) { -+ printk(KERN_INFO "TuxOnIce: Noresume on command line. 
Removed " -+ "image.\n"); -+ goto out_remove_image; -+ } -+ -+ /* Check whether we've resumed before */ -+ if (test_toi_state(TOI_RESUMED_BEFORE)) { -+ toi_early_boot_message(1, 0, NULL); -+ if (!(test_toi_state(TOI_CONTINUE_REQ))) { -+ printk(KERN_INFO "TuxOnIce: Tried to resume before: " -+ "Invalidated image.\n"); -+ goto out_remove_image; -+ } -+ } -+ -+ clear_toi_state(TOI_CONTINUE_REQ); -+ -+ toi_image_header_version = toiActiveAllocator->get_header_version(); -+ -+ if (unlikely(toi_image_header_version > TOI_HEADER_VERSION)) { -+ toi_early_boot_message(1, 0, image_version_error); -+ if (!(test_toi_state(TOI_CONTINUE_REQ))) { -+ printk(KERN_INFO "TuxOnIce: Header version too new: " -+ "Invalidated image.\n"); -+ goto out_remove_image; -+ } -+ } -+ -+ /* Read hibernate header */ -+ result = toiActiveAllocator->rw_header_chunk(READ, NULL, -+ header_buffer, sizeof(struct toi_header)); -+ if (result < 0) { -+ printk(KERN_ERR "TuxOnIce: Failed to read the image " -+ "signature.\n"); -+ goto out_remove_image; -+ } -+ -+ toi_header = (struct toi_header *) header_buffer; -+ -+ /* -+ * NB: This call may also result in a reboot rather than returning. -+ */ -+ -+ sanity_error = sanity_check(toi_header); -+ if (sanity_error) { -+ toi_early_boot_message(1, TOI_CONTINUE_REQ, -+ sanity_error); -+ printk(KERN_INFO "TuxOnIce: Sanity check failed.\n"); -+ goto out_remove_image; -+ } -+ -+ /* -+ * We have an image and it looks like it will load okay. -+ * -+ * Get metadata from header. Don't override commandline parameters. -+ * -+ * We don't need to save the image size limit because it's not used -+ * during resume and will be restored with the image anyway. 
-+ */ -+ -+ memcpy((char *) &pagedir1, -+ (char *) &toi_header->pagedir, sizeof(pagedir1)); -+ toi_result = toi_header->param0; -+ if (!toi_bkd.toi_debug_state) { -+ toi_bkd.toi_action = toi_header->param1; -+ toi_bkd.toi_debug_state = toi_header->param2; -+ toi_bkd.toi_default_console_level = toi_header->param3; -+ } -+ clear_toi_state(TOI_IGNORE_LOGLEVEL); -+ pagedir2.size = toi_header->pageset_2_size; -+ for (i = 0; i < 4; i++) -+ toi_bkd.toi_io_time[i/2][i%2] = -+ toi_header->io_time[i/2][i%2]; -+ -+ set_toi_state(TOI_BOOT_KERNEL); -+ boot_kernel_data_buffer = toi_header->bkd; -+ -+ read_if_version(1, toi_max_workers, "TuxOnIce max workers"); -+ -+ /* Read filesystem info */ -+ if (fs_info_load_and_check()) { -+ printk(KERN_EMERG "TuxOnIce: File system mount time checks " -+ "failed. Refusing to corrupt your filesystems!\n"); -+ goto out_remove_image; -+ } -+ -+ /* Read module configurations */ -+ result = read_module_configs(); -+ if (result) { -+ pagedir1.size = 0; -+ pagedir2.size = 0; -+ printk(KERN_INFO "TuxOnIce: Failed to read TuxOnIce module " -+ "configurations.\n"); -+ clear_action_state(TOI_KEEP_IMAGE); -+ goto out_remove_image; -+ } -+ -+ toi_prepare_console(); -+ -+ set_toi_state(TOI_NOW_RESUMING); -+ -+ if (!test_action_state(TOI_LATE_CPU_HOTPLUG)) { -+ toi_prepare_status(DONT_CLEAR_BAR, "Disable nonboot cpus."); -+ if (disable_nonboot_cpus()) { -+ set_abort_result(TOI_CPU_HOTPLUG_FAILED); -+ goto out_reset_console; -+ } -+ } -+ -+ if (usermodehelper_disable()) -+ goto out_enable_nonboot_cpus; -+ -+ current->flags |= PF_NOFREEZE; -+ freeze_result = FREEZE_IN_PROGRESS; -+ -+ schedule_work_on(first_cpu(cpu_online_map), &freeze_work); -+ -+ toi_cond_pause(1, "About to read original pageset1 locations."); -+ -+ /* -+ * See _toi_rw_header_chunk in tuxonice_bio.c: -+ * Initialize pageset1_map by reading the map from the image. 
-+ */ -+ if (memory_bm_read(pageset1_map, toiActiveAllocator->rw_header_chunk)) -+ goto out_thaw; -+ -+ /* -+ * See toi_rw_cleanup in tuxonice_bio.c: -+ * Clean up after reading the header. -+ */ -+ result = toiActiveAllocator->read_header_cleanup(); -+ if (result) { -+ printk(KERN_ERR "TuxOnIce: Failed to cleanup after reading the " -+ "image header.\n"); -+ goto out_thaw; -+ } -+ -+ toi_cond_pause(1, "About to read pagedir."); -+ -+ /* -+ * Get the addresses of pages into which we will load the kernel to -+ * be copied back and check if they conflict with the ones we are using. -+ */ -+ if (toi_get_pageset1_load_addresses()) { -+ printk(KERN_INFO "TuxOnIce: Failed to get load addresses for " -+ "pageset1.\n"); -+ goto out_thaw; -+ } -+ -+ /* Read the original kernel back */ -+ toi_cond_pause(1, "About to read pageset 1."); -+ -+ /* Given the pagemap, read back the data from disk */ -+ if (read_pageset(&pagedir1, 0)) { -+ toi_prepare_status(DONT_CLEAR_BAR, "Failed to read pageset 1."); -+ result = -EIO; -+ goto out_thaw; -+ } -+ -+ toi_cond_pause(1, "About to restore original kernel."); -+ result = 0; -+ -+ if (!test_action_state(TOI_KEEP_IMAGE) && -+ toiActiveAllocator->mark_resume_attempted) -+ toiActiveAllocator->mark_resume_attempted(1); -+ -+ wait_event(freeze_wait, freeze_result != FREEZE_IN_PROGRESS); -+out: -+ current->flags &= ~PF_NOFREEZE; -+ toi_free_page(25, (unsigned long) header_buffer); -+ return result; -+ -+out_thaw: -+ wait_event(freeze_wait, freeze_result != FREEZE_IN_PROGRESS); -+ trap_non_toi_io = 0; -+ thaw_processes(); -+ usermodehelper_enable(); -+out_enable_nonboot_cpus: -+ enable_nonboot_cpus(); -+out_reset_console: -+ toi_cleanup_console(); -+out_remove_image: -+ result = -EINVAL; -+ if (!test_action_state(TOI_KEEP_IMAGE)) -+ toiActiveAllocator->remove_image(); -+ toiActiveAllocator->read_header_cleanup(); -+ noresume_reset_modules(); -+ goto out; -+} -+ -+/** -+ * read_pageset1 - highlevel function to read the saved pages -+ * -+ * 
Attempt to read the header and pageset1 of a hibernate image.
-+ * Handle the outcome, complaining where appropriate.
-+ **/
-+int read_pageset1(void)
-+{
-+ int error;
-+
-+ error = __read_pageset1();
-+
-+ if (error && error != -ENODATA && error != -EINVAL &&
-+ !test_result_state(TOI_ABORTED))
-+ abort_hibernate(TOI_IMAGE_ERROR,
-+ "TuxOnIce: Error %d resuming\n", error);
-+
-+ return error;
-+}
-+
-+/**
-+ * get_have_image_data - check the image header
-+ **/
-+static char *get_have_image_data(void)
-+{
-+ char *output_buffer = (char *) toi_get_zeroed_page(26, TOI_ATOMIC_GFP);
-+ struct toi_header *toi_header;
-+
-+ if (!output_buffer) {
-+ printk(KERN_INFO "Output buffer null.\n");
-+ return NULL;
-+ }
-+
-+ /* Check for an image */
-+ if (!toiActiveAllocator->image_exists(1) ||
-+ toiActiveAllocator->read_header_init() ||
-+ toiActiveAllocator->rw_header_chunk(READ, NULL,
-+ output_buffer, sizeof(struct toi_header))) {
-+ sprintf(output_buffer, "0\n");
-+ /*
-+ * From an initrd/ramfs, catting have_image and
-+ * getting a result of 0 is sufficient.
-+ */
-+ clear_toi_state(TOI_BOOT_TIME);
-+ goto out;
-+ }
-+
-+ toi_header = (struct toi_header *) output_buffer;
-+
-+ sprintf(output_buffer, "1\n%s\n%s\n",
-+ toi_header->uts.machine,
-+ toi_header->uts.version);
-+
-+ /* Check whether we've resumed before */
-+ if (test_toi_state(TOI_RESUMED_BEFORE))
-+ strcat(output_buffer, "Resumed before.\n");
-+
-+out:
-+ noresume_reset_modules();
-+ return output_buffer;
-+}
-+
-+/**
-+ * read_pageset2 - read second part of the image
-+ * @overwrittenpagesonly: Read only pages which would have been
-+ * overwritten by pageset1?
-+ *
-+ * Read in part or all of pageset2 of an image, depending upon
-+ * whether we are hibernating and have only overwritten a portion
-+ * with pageset1 pages, or are resuming and need to read them
-+ * all.
-+ *
-+ * Returns: Int
-+ * Zero if no error, otherwise the error value.
-+ **/ -+int read_pageset2(int overwrittenpagesonly) -+{ -+ int result = 0; -+ -+ if (!pagedir2.size) -+ return 0; -+ -+ result = read_pageset(&pagedir2, overwrittenpagesonly); -+ -+ toi_cond_pause(1, "Pagedir 2 read."); -+ -+ return result; -+} -+ -+/** -+ * image_exists_read - has an image been found? -+ * @page: Output buffer -+ * -+ * Store 0 or 1 in page, depending on whether an image is found. -+ * Incoming buffer is PAGE_SIZE and result is guaranteed -+ * to be far less than that, so we don't worry about -+ * overflow. -+ **/ -+int image_exists_read(const char *page, int count) -+{ -+ int len = 0; -+ char *result; -+ -+ if (toi_activate_storage(0)) -+ return count; -+ -+ if (!test_toi_state(TOI_RESUME_DEVICE_OK)) -+ toi_attempt_to_parse_resume_device(0); -+ -+ if (!toiActiveAllocator) { -+ len = sprintf((char *) page, "-1\n"); -+ } else { -+ result = get_have_image_data(); -+ if (result) { -+ len = sprintf((char *) page, "%s", result); -+ toi_free_page(26, (unsigned long) result); -+ } -+ } -+ -+ toi_deactivate_storage(0); -+ -+ return len; -+} -+ -+/** -+ * image_exists_write - invalidate an image if one exists -+ **/ -+int image_exists_write(const char *buffer, int count) -+{ -+ if (toi_activate_storage(0)) -+ return count; -+ -+ if (toiActiveAllocator && toiActiveAllocator->image_exists(1)) -+ toiActiveAllocator->remove_image(); -+ -+ toi_deactivate_storage(0); -+ -+ clear_result_state(TOI_KEPT_IMAGE); -+ -+ return count; -+} -diff --git a/kernel/power/tuxonice_io.h b/kernel/power/tuxonice_io.h -new file mode 100644 -index 0000000..fe37713 ---- /dev/null -+++ b/kernel/power/tuxonice_io.h -@@ -0,0 +1,74 @@ -+/* -+ * kernel/power/tuxonice_io.h -+ * -+ * Copyright (C) 2005-2010 Nigel Cunningham (nigel at tuxonice net) -+ * -+ * This file is released under the GPLv2. -+ * -+ * It contains high level IO routines for hibernating. 
-+ * -+ */ -+ -+#include -+#include "tuxonice_pagedir.h" -+ -+/* Non-module data saved in our image header */ -+struct toi_header { -+ /* -+ * Mirror struct swsusp_info, but without -+ * the page aligned attribute -+ */ -+ struct new_utsname uts; -+ u32 version_code; -+ unsigned long num_physpages; -+ int cpus; -+ unsigned long image_pages; -+ unsigned long pages; -+ unsigned long size; -+ -+ /* Our own data */ -+ unsigned long orig_mem_free; -+ int page_size; -+ int pageset_2_size; -+ int param0; -+ int param1; -+ int param2; -+ int param3; -+ int progress0; -+ int progress1; -+ int progress2; -+ int progress3; -+ int io_time[2][2]; -+ struct pagedir pagedir; -+ dev_t root_fs; -+ unsigned long bkd; /* Boot kernel data locn */ -+}; -+ -+extern int write_pageset(struct pagedir *pagedir); -+extern int write_image_header(void); -+extern int read_pageset1(void); -+extern int read_pageset2(int overwrittenpagesonly); -+ -+extern int toi_attempt_to_parse_resume_device(int quiet); -+extern void attempt_to_parse_resume_device2(void); -+extern void attempt_to_parse_alt_resume_param(void); -+int image_exists_read(const char *page, int count); -+int image_exists_write(const char *buffer, int count); -+extern void save_restore_alt_param(int replace, int quiet); -+extern atomic_t toi_io_workers; -+ -+/* Args to save_restore_alt_param */ -+#define RESTORE 0 -+#define SAVE 1 -+ -+#define NOQUIET 0 -+#define QUIET 1 -+ -+extern dev_t name_to_dev_t(char *line); -+ -+extern wait_queue_head_t toi_io_queue_flusher; -+extern int toi_bio_queue_flusher_should_finish; -+ -+int fs_info_space_needed(void); -+ -+extern int toi_max_workers; -diff --git a/kernel/power/tuxonice_modules.c b/kernel/power/tuxonice_modules.c -new file mode 100644 -index 0000000..4cc24a9 ---- /dev/null -+++ b/kernel/power/tuxonice_modules.c -@@ -0,0 +1,522 @@ -+/* -+ * kernel/power/tuxonice_modules.c -+ * -+ * Copyright (C) 2004-2010 Nigel Cunningham (nigel at tuxonice net) -+ * -+ */ -+ -+#include -+#include 
"tuxonice.h" -+#include "tuxonice_modules.h" -+#include "tuxonice_sysfs.h" -+#include "tuxonice_ui.h" -+ -+LIST_HEAD(toi_filters); -+LIST_HEAD(toiAllocators); -+ -+LIST_HEAD(toi_modules); -+EXPORT_SYMBOL_GPL(toi_modules); -+ -+struct toi_module_ops *toiActiveAllocator; -+EXPORT_SYMBOL_GPL(toiActiveAllocator); -+ -+static int toi_num_filters; -+int toiNumAllocators, toi_num_modules; -+ -+/* -+ * toi_header_storage_for_modules -+ * -+ * Returns the amount of space needed to store configuration -+ * data needed by the modules prior to copying back the original -+ * kernel. We can exclude data for pageset2 because it will be -+ * available anyway once the kernel is copied back. -+ */ -+long toi_header_storage_for_modules(void) -+{ -+ struct toi_module_ops *this_module; -+ int bytes = 0; -+ -+ list_for_each_entry(this_module, &toi_modules, module_list) { -+ if (!this_module->enabled || -+ (this_module->type == WRITER_MODULE && -+ toiActiveAllocator != this_module)) -+ continue; -+ if (this_module->storage_needed) { -+ int this = this_module->storage_needed() + -+ sizeof(struct toi_module_header) + -+ sizeof(int); -+ this_module->header_requested = this; -+ bytes += this; -+ } -+ } -+ -+ /* One more for the empty terminator */ -+ return bytes + sizeof(struct toi_module_header); -+} -+ -+void print_toi_header_storage_for_modules(void) -+{ -+ struct toi_module_ops *this_module; -+ int bytes = 0; -+ -+ printk(KERN_DEBUG "Header storage:\n"); -+ list_for_each_entry(this_module, &toi_modules, module_list) { -+ if (!this_module->enabled || -+ (this_module->type == WRITER_MODULE && -+ toiActiveAllocator != this_module)) -+ continue; -+ if (this_module->storage_needed) { -+ int this = this_module->storage_needed() + -+ sizeof(struct toi_module_header) + -+ sizeof(int); -+ this_module->header_requested = this; -+ bytes += this; -+ printk(KERN_DEBUG "+ %16s : %-4d/%d.\n", -+ this_module->name, -+ this_module->header_used, this); -+ } -+ } -+ -+ printk(KERN_DEBUG "+ empty 
terminator : %zu.\n", -+ sizeof(struct toi_module_header)); -+ printk(KERN_DEBUG " ====\n"); -+ printk(KERN_DEBUG " %zu\n", -+ bytes + sizeof(struct toi_module_header)); -+} -+EXPORT_SYMBOL_GPL(print_toi_header_storage_for_modules); -+ -+/* -+ * toi_memory_for_modules -+ * -+ * Returns the amount of memory requested by modules for -+ * doing their work during the cycle. -+ */ -+ -+long toi_memory_for_modules(int print_parts) -+{ -+ long bytes = 0, result; -+ struct toi_module_ops *this_module; -+ -+ if (print_parts) -+ printk(KERN_INFO "Memory for modules:\n===================\n"); -+ list_for_each_entry(this_module, &toi_modules, module_list) { -+ int this; -+ if (!this_module->enabled) -+ continue; -+ if (this_module->memory_needed) { -+ this = this_module->memory_needed(); -+ if (print_parts) -+ printk(KERN_INFO "%10d bytes (%5ld pages) for " -+ "module '%s'.\n", this, -+ DIV_ROUND_UP(this, PAGE_SIZE), -+ this_module->name); -+ bytes += this; -+ } -+ } -+ -+ result = DIV_ROUND_UP(bytes, PAGE_SIZE); -+ if (print_parts) -+ printk(KERN_INFO " => %ld bytes, %ld pages.\n", bytes, result); -+ -+ return result; -+} -+ -+/* -+ * toi_expected_compression_ratio -+ * -+ * Returns the compression ratio expected when saving the image. 
-+ */ -+ -+int toi_expected_compression_ratio(void) -+{ -+ int ratio = 100; -+ struct toi_module_ops *this_module; -+ -+ list_for_each_entry(this_module, &toi_modules, module_list) { -+ if (!this_module->enabled) -+ continue; -+ if (this_module->expected_compression) -+ ratio = ratio * this_module->expected_compression() -+ / 100; -+ } -+ -+ return ratio; -+} -+ -+/* toi_find_module_given_dir -+ * Functionality : Return a module (if found), given a pointer -+ * to its directory name -+ */ -+ -+static struct toi_module_ops *toi_find_module_given_dir(char *name) -+{ -+ struct toi_module_ops *this_module, *found_module = NULL; -+ -+ list_for_each_entry(this_module, &toi_modules, module_list) { -+ if (!strcmp(name, this_module->directory)) { -+ found_module = this_module; -+ break; -+ } -+ } -+ -+ return found_module; -+} -+ -+/* toi_find_module_given_name -+ * Functionality : Return a module (if found), given a pointer -+ * to its name -+ */ -+ -+struct toi_module_ops *toi_find_module_given_name(char *name) -+{ -+ struct toi_module_ops *this_module, *found_module = NULL; -+ -+ list_for_each_entry(this_module, &toi_modules, module_list) { -+ if (!strcmp(name, this_module->name)) { -+ found_module = this_module; -+ break; -+ } -+ } -+ -+ return found_module; -+} -+ -+/* -+ * toi_print_module_debug_info -+ * Functionality : Get debugging info from modules into a buffer. -+ */ -+int toi_print_module_debug_info(char *buffer, int buffer_size) -+{ -+ struct toi_module_ops *this_module; -+ int len = 0; -+ -+ list_for_each_entry(this_module, &toi_modules, module_list) { -+ if (!this_module->enabled) -+ continue; -+ if (this_module->print_debug_info) { -+ int result; -+ result = this_module->print_debug_info(buffer + len, -+ buffer_size - len); -+ len += result; -+ } -+ } -+ -+ /* Ensure null terminated */ -+ buffer[buffer_size] = 0; -+ -+ return len; -+} -+ -+/* -+ * toi_register_module -+ * -+ * Register a module. 
-+ */ -+int toi_register_module(struct toi_module_ops *module) -+{ -+ int i; -+ struct kobject *kobj; -+ -+ module->enabled = 1; -+ -+ if (toi_find_module_given_name(module->name)) { -+ printk(KERN_INFO "TuxOnIce: Trying to load module %s," -+ " which is already registered.\n", -+ module->name); -+ return -EBUSY; -+ } -+ -+ switch (module->type) { -+ case FILTER_MODULE: -+ list_add_tail(&module->type_list, &toi_filters); -+ toi_num_filters++; -+ break; -+ case WRITER_MODULE: -+ list_add_tail(&module->type_list, &toiAllocators); -+ toiNumAllocators++; -+ break; -+ case MISC_MODULE: -+ case MISC_HIDDEN_MODULE: -+ case BIO_ALLOCATOR_MODULE: -+ break; -+ default: -+ printk(KERN_ERR "Hmmm. Module '%s' has an invalid type." -+ " It has been ignored.\n", module->name); -+ return -EINVAL; -+ } -+ list_add_tail(&module->module_list, &toi_modules); -+ toi_num_modules++; -+ -+ if ((!module->directory && !module->shared_directory) || -+ !module->sysfs_data || !module->num_sysfs_entries) -+ return 0; -+ -+ /* -+ * Modules may share a directory, but those with shared_dir -+ * set must be loaded (via symbol dependencies) after parents -+ * and unloaded beforehand. 
-+ */ -+ if (module->shared_directory) { -+ struct toi_module_ops *shared = -+ toi_find_module_given_dir(module->shared_directory); -+ if (!shared) { -+ printk(KERN_ERR "TuxOnIce: Module %s wants to share " -+ "%s's directory but %s isn't loaded.\n", -+ module->name, module->shared_directory, -+ module->shared_directory); -+ toi_unregister_module(module); -+ return -ENODEV; -+ } -+ kobj = shared->dir_kobj; -+ } else { -+ if (!strncmp(module->directory, "[ROOT]", 6)) -+ kobj = tuxonice_kobj; -+ else -+ kobj = make_toi_sysdir(module->directory); -+ } -+ module->dir_kobj = kobj; -+ for (i = 0; i < module->num_sysfs_entries; i++) { -+ int result = toi_register_sysfs_file(kobj, -+ &module->sysfs_data[i]); -+ if (result) -+ return result; -+ } -+ return 0; -+} -+EXPORT_SYMBOL_GPL(toi_register_module); -+ -+/* -+ * toi_unregister_module -+ * -+ * Remove a module. -+ */ -+void toi_unregister_module(struct toi_module_ops *module) -+{ -+ int i; -+ -+ if (module->dir_kobj) -+ for (i = 0; i < module->num_sysfs_entries; i++) -+ toi_unregister_sysfs_file(module->dir_kobj, -+ &module->sysfs_data[i]); -+ -+ if (!module->shared_directory && module->directory && -+ strncmp(module->directory, "[ROOT]", 6)) -+ remove_toi_sysdir(module->dir_kobj); -+ -+ switch (module->type) { -+ case FILTER_MODULE: -+ list_del(&module->type_list); -+ toi_num_filters--; -+ break; -+ case WRITER_MODULE: -+ list_del(&module->type_list); -+ toiNumAllocators--; -+ if (toiActiveAllocator == module) { -+ toiActiveAllocator = NULL; -+ clear_toi_state(TOI_CAN_RESUME); -+ clear_toi_state(TOI_CAN_HIBERNATE); -+ } -+ break; -+ case MISC_MODULE: -+ case MISC_HIDDEN_MODULE: -+ case BIO_ALLOCATOR_MODULE: -+ break; -+ default: -+ printk(KERN_ERR "Module '%s' has an invalid type." 
-+ " It has been ignored.\n", module->name); -+ return; -+ } -+ list_del(&module->module_list); -+ toi_num_modules--; -+} -+EXPORT_SYMBOL_GPL(toi_unregister_module); -+ -+/* -+ * toi_move_module_tail -+ * -+ * Rearrange modules when reloading the config. -+ */ -+void toi_move_module_tail(struct toi_module_ops *module) -+{ -+ switch (module->type) { -+ case FILTER_MODULE: -+ if (toi_num_filters > 1) -+ list_move_tail(&module->type_list, &toi_filters); -+ break; -+ case WRITER_MODULE: -+ if (toiNumAllocators > 1) -+ list_move_tail(&module->type_list, &toiAllocators); -+ break; -+ case MISC_MODULE: -+ case MISC_HIDDEN_MODULE: -+ case BIO_ALLOCATOR_MODULE: -+ break; -+ default: -+ printk(KERN_ERR "Module '%s' has an invalid type." -+ " It has been ignored.\n", module->name); -+ return; -+ } -+ if ((toi_num_filters + toiNumAllocators) > 1) -+ list_move_tail(&module->module_list, &toi_modules); -+} -+ -+/* -+ * toi_initialise_modules -+ * -+ * Get ready to do some work! -+ */ -+int toi_initialise_modules(int starting_cycle, int early) -+{ -+ struct toi_module_ops *this_module; -+ int result; -+ -+ list_for_each_entry(this_module, &toi_modules, module_list) { -+ this_module->header_requested = 0; -+ this_module->header_used = 0; -+ if (!this_module->enabled) -+ continue; -+ if (this_module->early != early) -+ continue; -+ if (this_module->initialise) { -+ result = this_module->initialise(starting_cycle); -+ if (result) { -+ toi_cleanup_modules(starting_cycle); -+ return result; -+ } -+ this_module->initialised = 1; -+ } -+ } -+ -+ return 0; -+} -+ -+/* -+ * toi_cleanup_modules -+ * -+ * Tell modules the work is done. 
-+ */
-+void toi_cleanup_modules(int finishing_cycle)
-+{
-+ struct toi_module_ops *this_module;
-+
-+ list_for_each_entry(this_module, &toi_modules, module_list) {
-+ if (!this_module->enabled || !this_module->initialised)
-+ continue;
-+ if (this_module->cleanup)
-+ this_module->cleanup(finishing_cycle);
-+ this_module->initialised = 0;
-+ }
-+}
-+
-+/*
-+ * toi_pre_atomic_restore_modules
-+ *
-+ * Get ready to do some work!
-+ */
-+void toi_pre_atomic_restore_modules(struct toi_boot_kernel_data *bkd)
-+{
-+ struct toi_module_ops *this_module;
-+
-+ list_for_each_entry(this_module, &toi_modules, module_list) {
-+ if (this_module->enabled && this_module->pre_atomic_restore)
-+ this_module->pre_atomic_restore(bkd);
-+ }
-+}
-+
-+/*
-+ * toi_post_atomic_restore_modules
-+ *
-+ * Get ready to do some work!
-+ */
-+void toi_post_atomic_restore_modules(struct toi_boot_kernel_data *bkd)
-+{
-+ struct toi_module_ops *this_module;
-+
-+ list_for_each_entry(this_module, &toi_modules, module_list) {
-+ if (this_module->enabled && this_module->post_atomic_restore)
-+ this_module->post_atomic_restore(bkd);
-+ }
-+}
-+
-+/*
-+ * toi_get_next_filter
-+ *
-+ * Get the next filter in the pipeline.
-+ */
-+struct toi_module_ops *toi_get_next_filter(struct toi_module_ops *filter_sought)
-+{
-+ struct toi_module_ops *last_filter = NULL, *this_filter = NULL;
-+
-+ list_for_each_entry(this_filter, &toi_filters, type_list) {
-+ if (!this_filter->enabled)
-+ continue;
-+ if ((last_filter == filter_sought) || (!filter_sought))
-+ return this_filter;
-+ last_filter = this_filter;
-+ }
-+
-+ return toiActiveAllocator;
-+}
-+EXPORT_SYMBOL_GPL(toi_get_next_filter);
-+
-+/**
-+ * toi_print_modules: Printk what support is loaded.
-+ */ -+void toi_print_modules(void) -+{ -+ struct toi_module_ops *this_module; -+ int prev = 0; -+ -+ printk(KERN_INFO "TuxOnIce " TOI_CORE_VERSION ", with support for"); -+ -+ list_for_each_entry(this_module, &toi_modules, module_list) { -+ if (this_module->type == MISC_HIDDEN_MODULE) -+ continue; -+ printk("%s %s%s%s", prev ? "," : "", -+ this_module->enabled ? "" : "[", -+ this_module->name, -+ this_module->enabled ? "" : "]"); -+ prev = 1; -+ } -+ -+ printk(".\n"); -+} -+ -+/* toi_get_modules -+ * -+ * Take a reference to modules so they can't go away under us. -+ */ -+ -+int toi_get_modules(void) -+{ -+ struct toi_module_ops *this_module; -+ -+ list_for_each_entry(this_module, &toi_modules, module_list) { -+ struct toi_module_ops *this_module2; -+ -+ if (try_module_get(this_module->module)) -+ continue; -+ -+ /* Failed! Reverse gets and return error */ -+ list_for_each_entry(this_module2, &toi_modules, -+ module_list) { -+ if (this_module == this_module2) -+ return -EINVAL; -+ module_put(this_module2->module); -+ } -+ } -+ return 0; -+} -+ -+/* toi_put_modules -+ * -+ * Release our references to modules we used. -+ */ -+ -+void toi_put_modules(void) -+{ -+ struct toi_module_ops *this_module; -+ -+ list_for_each_entry(this_module, &toi_modules, module_list) -+ module_put(this_module->module); -+} -diff --git a/kernel/power/tuxonice_modules.h b/kernel/power/tuxonice_modules.h -new file mode 100644 -index 0000000..9e198c4 ---- /dev/null -+++ b/kernel/power/tuxonice_modules.h -@@ -0,0 +1,197 @@ -+/* -+ * kernel/power/tuxonice_modules.h -+ * -+ * Copyright (C) 2004-2010 Nigel Cunningham (nigel at tuxonice net) -+ * -+ * This file is released under the GPLv2. -+ * -+ * It contains declarations for modules. Modules are additions to -+ * TuxOnIce that provide facilities such as image compression or -+ * encryption, backends for storage of the image and user interfaces. 
-+ * -+ */ -+ -+#ifndef TOI_MODULES_H -+#define TOI_MODULES_H -+ -+/* This is the maximum size we store in the image header for a module name */ -+#define TOI_MAX_MODULE_NAME_LENGTH 30 -+ -+struct toi_boot_kernel_data; -+ -+/* Per-module metadata */ -+struct toi_module_header { -+ char name[TOI_MAX_MODULE_NAME_LENGTH]; -+ int enabled; -+ int type; -+ int index; -+ int data_length; -+ unsigned long signature; -+}; -+ -+enum { -+ FILTER_MODULE, -+ WRITER_MODULE, -+ BIO_ALLOCATOR_MODULE, -+ MISC_MODULE, -+ MISC_HIDDEN_MODULE, -+}; -+ -+enum { -+ TOI_ASYNC, -+ TOI_SYNC -+}; -+ -+struct toi_module_ops { -+ /* Functions common to all modules */ -+ int type; -+ char *name; -+ char *directory; -+ char *shared_directory; -+ struct kobject *dir_kobj; -+ struct module *module; -+ int enabled, early, initialised; -+ struct list_head module_list; -+ -+ /* List of filters or allocators */ -+ struct list_head list, type_list; -+ -+ /* -+ * Requirements for memory and storage in -+ * the image header.. -+ */ -+ int (*memory_needed) (void); -+ int (*storage_needed) (void); -+ -+ int header_requested, header_used; -+ -+ int (*expected_compression) (void); -+ -+ /* -+ * Debug info -+ */ -+ int (*print_debug_info) (char *buffer, int size); -+ int (*save_config_info) (char *buffer); -+ void (*load_config_info) (char *buffer, int len); -+ -+ /* -+ * Initialise & cleanup - general routines called -+ * at the start and end of a cycle. -+ */ -+ int (*initialise) (int starting_cycle); -+ void (*cleanup) (int finishing_cycle); -+ -+ void (*pre_atomic_restore) (struct toi_boot_kernel_data *bkd); -+ void (*post_atomic_restore) (struct toi_boot_kernel_data *bkd); -+ -+ /* -+ * Calls for allocating storage (allocators only). -+ * -+ * Header space is requested separately and cannot fail, but the -+ * reservation is only applied when main storage is allocated. 
-+ * The header space reservation is thus always set prior to -+ * requesting the allocation of storage - and prior to querying -+ * how much storage is available. -+ */ -+ -+ unsigned long (*storage_available) (void); -+ void (*reserve_header_space) (unsigned long space_requested); -+ int (*register_storage) (void); -+ int (*allocate_storage) (unsigned long space_requested); -+ unsigned long (*storage_allocated) (void); -+ -+ /* -+ * Routines used in image I/O. -+ */ -+ int (*rw_init) (int rw, int stream_number); -+ int (*rw_cleanup) (int rw); -+ int (*write_page) (unsigned long index, struct page *buffer_page, -+ unsigned int buf_size); -+ int (*read_page) (unsigned long *index, struct page *buffer_page, -+ unsigned int *buf_size); -+ int (*io_flusher) (int rw); -+ -+ /* Reset module if image exists but reading aborted */ -+ void (*noresume_reset) (void); -+ -+ /* Read and write the metadata */ -+ int (*write_header_init) (void); -+ int (*write_header_cleanup) (void); -+ -+ int (*read_header_init) (void); -+ int (*read_header_cleanup) (void); -+ -+ /* To be called after read_header_init */ -+ int (*get_header_version) (void); -+ -+ int (*rw_header_chunk) (int rw, struct toi_module_ops *owner, -+ char *buffer_start, int buffer_size); -+ -+ int (*rw_header_chunk_noreadahead) (int rw, -+ struct toi_module_ops *owner, char *buffer_start, -+ int buffer_size); -+ -+ /* Attempt to parse an image location */ -+ int (*parse_sig_location) (char *buffer, int only_writer, int quiet); -+ -+ /* Throttle I/O according to throughput */ -+ void (*update_throughput_throttle) (int jif_index); -+ -+ /* Flush outstanding I/O */ -+ int (*finish_all_io) (void); -+ -+ /* Determine whether image exists that we can restore */ -+ int (*image_exists) (int quiet); -+ -+ /* Mark the image as having tried to resume */ -+ int (*mark_resume_attempted) (int); -+ -+ /* Destroy image if one exists */ -+ int (*remove_image) (void); -+ -+ /* Sysfs Data */ -+ struct toi_sysfs_data *sysfs_data; -+ int 
num_sysfs_entries; -+ -+ /* Block I/O allocator */ -+ struct toi_bio_allocator_ops *bio_allocator_ops; -+}; -+ -+extern int toi_num_modules, toiNumAllocators; -+ -+extern struct toi_module_ops *toiActiveAllocator; -+extern struct list_head toi_filters, toiAllocators, toi_modules; -+ -+extern void toi_prepare_console_modules(void); -+extern void toi_cleanup_console_modules(void); -+ -+extern struct toi_module_ops *toi_find_module_given_name(char *name); -+extern struct toi_module_ops *toi_get_next_filter(struct toi_module_ops *); -+ -+extern int toi_register_module(struct toi_module_ops *module); -+extern void toi_move_module_tail(struct toi_module_ops *module); -+ -+extern long toi_header_storage_for_modules(void); -+extern long toi_memory_for_modules(int print_parts); -+extern void print_toi_header_storage_for_modules(void); -+extern int toi_expected_compression_ratio(void); -+ -+extern int toi_print_module_debug_info(char *buffer, int buffer_size); -+extern int toi_register_module(struct toi_module_ops *module); -+extern void toi_unregister_module(struct toi_module_ops *module); -+ -+extern int toi_initialise_modules(int starting_cycle, int early); -+#define toi_initialise_modules_early(starting) \ -+ toi_initialise_modules(starting, 1) -+#define toi_initialise_modules_late(starting) \ -+ toi_initialise_modules(starting, 0) -+extern void toi_cleanup_modules(int finishing_cycle); -+ -+extern void toi_post_atomic_restore_modules(struct toi_boot_kernel_data *bkd); -+extern void toi_pre_atomic_restore_modules(struct toi_boot_kernel_data *bkd); -+ -+extern void toi_print_modules(void); -+ -+int toi_get_modules(void); -+void toi_put_modules(void); -+#endif -diff --git a/kernel/power/tuxonice_netlink.c b/kernel/power/tuxonice_netlink.c -new file mode 100644 -index 0000000..4c599d5 ---- /dev/null -+++ b/kernel/power/tuxonice_netlink.c -@@ -0,0 +1,344 @@ -+/* -+ * kernel/power/tuxonice_netlink.c -+ * -+ * Copyright (C) 2004-2010 Nigel Cunningham (nigel at tuxonice net) -+ 
* -+ * This file is released under the GPLv2. -+ * -+ * Functions for communicating with a userspace helper via netlink. -+ */ -+ -+ -+#include -+#include -+#include "tuxonice_netlink.h" -+#include "tuxonice.h" -+#include "tuxonice_modules.h" -+#include "tuxonice_alloc.h" -+ -+static struct user_helper_data *uhd_list; -+ -+/* -+ * Refill our pool of SKBs for use in emergencies (eg, when eating memory and -+ * none can be allocated). -+ */ -+static void toi_fill_skb_pool(struct user_helper_data *uhd) -+{ -+ while (uhd->pool_level < uhd->pool_limit) { -+ struct sk_buff *new_skb = -+ alloc_skb(NLMSG_SPACE(uhd->skb_size), TOI_ATOMIC_GFP); -+ -+ if (!new_skb) -+ break; -+ -+ new_skb->next = uhd->emerg_skbs; -+ uhd->emerg_skbs = new_skb; -+ uhd->pool_level++; -+ } -+} -+ -+/* -+ * Try to allocate a single skb. If we can't get one, try to use one from -+ * our pool. -+ */ -+static struct sk_buff *toi_get_skb(struct user_helper_data *uhd) -+{ -+ struct sk_buff *skb = -+ alloc_skb(NLMSG_SPACE(uhd->skb_size), TOI_ATOMIC_GFP); -+ -+ if (skb) -+ return skb; -+ -+ skb = uhd->emerg_skbs; -+ if (skb) { -+ uhd->pool_level--; -+ uhd->emerg_skbs = skb->next; -+ skb->next = NULL; -+ } -+ -+ return skb; -+} -+ -+static void put_skb(struct user_helper_data *uhd, struct sk_buff *skb) -+{ -+ if (uhd->pool_level < uhd->pool_limit) { -+ skb->next = uhd->emerg_skbs; -+ uhd->emerg_skbs = skb; -+ } else -+ kfree_skb(skb); -+} -+ -+void toi_send_netlink_message(struct user_helper_data *uhd, -+ int type, void *params, size_t len) -+{ -+ struct sk_buff *skb; -+ struct nlmsghdr *nlh; -+ void *dest; -+ struct task_struct *t; -+ -+ if (uhd->pid == -1) -+ return; -+ -+ if (uhd->debug) -+ printk(KERN_ERR "toi_send_netlink_message: Send " -+ "message type %d.\n", type); -+ -+ skb = toi_get_skb(uhd); -+ if (!skb) { -+ printk(KERN_INFO "toi_netlink: Can't allocate skb!\n"); -+ return; -+ } -+ -+ /* NLMSG_PUT contains a hidden goto nlmsg_failure */ -+ nlh = NLMSG_PUT(skb, 0, uhd->sock_seq, type, len); -+ 
uhd->sock_seq++; -+ -+ dest = NLMSG_DATA(nlh); -+ if (params && len > 0) -+ memcpy(dest, params, len); -+ -+ netlink_unicast(uhd->nl, skb, uhd->pid, 0); -+ -+ read_lock(&tasklist_lock); -+ t = find_task_by_pid_ns(uhd->pid, &init_pid_ns); -+ if (!t) { -+ read_unlock(&tasklist_lock); -+ if (uhd->pid > -1) -+ printk(KERN_INFO "Hmm. Can't find the userspace task" -+ " %d.\n", uhd->pid); -+ return; -+ } -+ wake_up_process(t); -+ read_unlock(&tasklist_lock); -+ -+ yield(); -+ -+ return; -+ -+nlmsg_failure: -+ if (skb) -+ put_skb(uhd, skb); -+ -+ if (uhd->debug) -+ printk(KERN_ERR "toi_send_netlink_message: Failed to send " -+ "message type %d.\n", type); -+} -+EXPORT_SYMBOL_GPL(toi_send_netlink_message); -+ -+static void send_whether_debugging(struct user_helper_data *uhd) -+{ -+ static u8 is_debugging = 1; -+ -+ toi_send_netlink_message(uhd, NETLINK_MSG_IS_DEBUGGING, -+ &is_debugging, sizeof(u8)); -+} -+ -+/* -+ * Set the PF_NOFREEZE flag on the given process to ensure it can run whilst we -+ * are hibernating. -+ */ -+static int nl_set_nofreeze(struct user_helper_data *uhd, __u32 pid) -+{ -+ struct task_struct *t; -+ -+ if (uhd->debug) -+ printk(KERN_ERR "nl_set_nofreeze for pid %d.\n", pid); -+ -+ read_lock(&tasklist_lock); -+ t = find_task_by_pid_ns(pid, &init_pid_ns); -+ if (!t) { -+ read_unlock(&tasklist_lock); -+ printk(KERN_INFO "Strange. Can't find the userspace task %d.\n", -+ pid); -+ return -EINVAL; -+ } -+ -+ t->flags |= PF_NOFREEZE; -+ -+ read_unlock(&tasklist_lock); -+ uhd->pid = pid; -+ -+ toi_send_netlink_message(uhd, NETLINK_MSG_NOFREEZE_ACK, NULL, 0); -+ -+ return 0; -+} -+ -+/* -+ * Called when the userspace process has informed us that it's ready to roll. -+ */ -+static int nl_ready(struct user_helper_data *uhd, u32 version) -+{ -+ if (version != uhd->interface_version) { -+ printk(KERN_INFO "%s userspace process using invalid interface" -+ " version (%d - kernel wants %d). 
Trying to "
-+ "continue without it.\n",
-+ uhd->name, version, uhd->interface_version);
-+ if (uhd->not_ready)
-+ uhd->not_ready();
-+ return -EINVAL;
-+ }
-+
-+ complete(&uhd->wait_for_process);
-+
-+ return 0;
-+}
-+
-+void toi_netlink_close_complete(struct user_helper_data *uhd)
-+{
-+ if (uhd->nl) {
-+ netlink_kernel_release(uhd->nl);
-+ uhd->nl = NULL;
-+ }
-+
-+ while (uhd->emerg_skbs) {
-+ struct sk_buff *next = uhd->emerg_skbs->next;
-+ kfree_skb(uhd->emerg_skbs);
-+ uhd->emerg_skbs = next;
-+ }
-+
-+ uhd->pid = -1;
-+}
-+EXPORT_SYMBOL_GPL(toi_netlink_close_complete);
-+
-+static int toi_nl_gen_rcv_msg(struct user_helper_data *uhd,
-+ struct sk_buff *skb, struct nlmsghdr *nlh)
-+{
-+ int type = nlh->nlmsg_type;
-+ int *data;
-+ int err;
-+
-+ if (uhd->debug)
-+ printk(KERN_ERR "toi_user_rcv_skb: Received message %d.\n",
-+ type);
-+
-+ /* Let the more specific handler go first. It returns
-+ * 1 for valid messages that it doesn't know. */
-+ err = uhd->rcv_msg(skb, nlh);
-+ if (err != 1)
-+ return err;
-+
-+ /* Only allow one task to receive NOFREEZE privileges */
-+ if (type == NETLINK_MSG_NOFREEZE_ME && uhd->pid != -1) {
-+ printk(KERN_INFO "Received extra nofreeze me requests.\n");
-+ return -EBUSY;
-+ }
-+
-+ data = NLMSG_DATA(nlh);
-+
-+ switch (type) {
-+ case NETLINK_MSG_NOFREEZE_ME:
-+ return nl_set_nofreeze(uhd, nlh->nlmsg_pid);
-+ case NETLINK_MSG_GET_DEBUGGING:
-+ send_whether_debugging(uhd);
-+ return 0;
-+ case NETLINK_MSG_READY:
-+ if (nlh->nlmsg_len != NLMSG_LENGTH(sizeof(u32))) {
-+ printk(KERN_INFO "Invalid ready message.\n");
-+ if (uhd->not_ready)
-+ uhd->not_ready();
-+ return -EINVAL;
-+ }
-+ return nl_ready(uhd, (u32) *data);
-+ case NETLINK_MSG_CLEANUP:
-+ toi_netlink_close_complete(uhd);
-+ return 0;
-+ }
-+
-+ return -EINVAL;
-+}
-+
-+static void toi_user_rcv_skb(struct sk_buff *skb)
-+{
-+ int err;
-+ struct nlmsghdr *nlh;
-+ struct user_helper_data *uhd = uhd_list;
-+
-+ while (uhd && uhd->netlink_id != skb->sk->sk_protocol)
-+
uhd = uhd->next; -+ -+ if (!uhd) -+ return; -+ -+ while (skb->len >= NLMSG_SPACE(0)) { -+ u32 rlen; -+ -+ nlh = (struct nlmsghdr *) skb->data; -+ if (nlh->nlmsg_len < sizeof(*nlh) || skb->len < nlh->nlmsg_len) -+ return; -+ -+ rlen = NLMSG_ALIGN(nlh->nlmsg_len); -+ if (rlen > skb->len) -+ rlen = skb->len; -+ -+ err = toi_nl_gen_rcv_msg(uhd, skb, nlh); -+ if (err) -+ netlink_ack(skb, nlh, err); -+ else if (nlh->nlmsg_flags & NLM_F_ACK) -+ netlink_ack(skb, nlh, 0); -+ skb_pull(skb, rlen); -+ } -+} -+ -+static int netlink_prepare(struct user_helper_data *uhd) -+{ -+ uhd->next = uhd_list; -+ uhd_list = uhd; -+ -+ uhd->sock_seq = 0x42c0ffee; -+ uhd->nl = netlink_kernel_create(&init_net, uhd->netlink_id, 0, -+ toi_user_rcv_skb, NULL, THIS_MODULE); -+ if (!uhd->nl) { -+ printk(KERN_INFO "Failed to allocate netlink socket for %s.\n", -+ uhd->name); -+ return -ENOMEM; -+ } -+ -+ toi_fill_skb_pool(uhd); -+ -+ return 0; -+} -+ -+void toi_netlink_close(struct user_helper_data *uhd) -+{ -+ struct task_struct *t; -+ -+ read_lock(&tasklist_lock); -+ t = find_task_by_pid_ns(uhd->pid, &init_pid_ns); -+ if (t) -+ t->flags &= ~PF_NOFREEZE; -+ read_unlock(&tasklist_lock); -+ -+ toi_send_netlink_message(uhd, NETLINK_MSG_CLEANUP, NULL, 0); -+} -+EXPORT_SYMBOL_GPL(toi_netlink_close); -+ -+int toi_netlink_setup(struct user_helper_data *uhd) -+{ -+ /* In case userui didn't cleanup properly on us */ -+ toi_netlink_close_complete(uhd); -+ -+ if (netlink_prepare(uhd) < 0) { -+ printk(KERN_INFO "Netlink prepare failed.\n"); -+ return 1; -+ } -+ -+ if (toi_launch_userspace_program(uhd->program, uhd->netlink_id, -+ UMH_WAIT_EXEC, uhd->debug) < 0) { -+ printk(KERN_INFO "Launch userspace program failed.\n"); -+ toi_netlink_close_complete(uhd); -+ return 1; -+ } -+ -+ /* Wait 2 seconds for the userspace process to make contact */ -+ wait_for_completion_timeout(&uhd->wait_for_process, 2*HZ); -+ -+ if (uhd->pid == -1) { -+ printk(KERN_INFO "%s: Failed to contact userspace process.\n", -+ uhd->name); 
-+ toi_netlink_close_complete(uhd); -+ return 1; -+ } -+ -+ return 0; -+} -+EXPORT_SYMBOL_GPL(toi_netlink_setup); -diff --git a/kernel/power/tuxonice_netlink.h b/kernel/power/tuxonice_netlink.h -new file mode 100644 -index 0000000..b8ef06e ---- /dev/null -+++ b/kernel/power/tuxonice_netlink.h -@@ -0,0 +1,62 @@ -+/* -+ * kernel/power/tuxonice_netlink.h -+ * -+ * Copyright (C) 2004-2010 Nigel Cunningham (nigel at tuxonice net) -+ * -+ * This file is released under the GPLv2. -+ * -+ * Declarations for functions for communicating with a userspace helper -+ * via netlink. -+ */ -+ -+#include -+#include -+ -+#define NETLINK_MSG_BASE 0x10 -+ -+#define NETLINK_MSG_READY 0x10 -+#define NETLINK_MSG_NOFREEZE_ME 0x16 -+#define NETLINK_MSG_GET_DEBUGGING 0x19 -+#define NETLINK_MSG_CLEANUP 0x24 -+#define NETLINK_MSG_NOFREEZE_ACK 0x27 -+#define NETLINK_MSG_IS_DEBUGGING 0x28 -+ -+struct user_helper_data { -+ int (*rcv_msg) (struct sk_buff *skb, struct nlmsghdr *nlh); -+ void (*not_ready) (void); -+ struct sock *nl; -+ u32 sock_seq; -+ pid_t pid; -+ char *comm; -+ char program[256]; -+ int pool_level; -+ int pool_limit; -+ struct sk_buff *emerg_skbs; -+ int skb_size; -+ int netlink_id; -+ char *name; -+ struct user_helper_data *next; -+ struct completion wait_for_process; -+ u32 interface_version; -+ int must_init; -+ int debug; -+}; -+ -+#ifdef CONFIG_NET -+int toi_netlink_setup(struct user_helper_data *uhd); -+void toi_netlink_close(struct user_helper_data *uhd); -+void toi_send_netlink_message(struct user_helper_data *uhd, -+ int type, void *params, size_t len); -+void toi_netlink_close_complete(struct user_helper_data *uhd); -+#else -+static inline int toi_netlink_setup(struct user_helper_data *uhd) -+{ -+ return 0; -+} -+ -+static inline void toi_netlink_close(struct user_helper_data *uhd) { }; -+static inline void toi_send_netlink_message(struct user_helper_data *uhd, -+ int type, void *params, size_t len) { }; -+static inline void toi_netlink_close_complete(struct 
user_helper_data *uhd) -+ { }; -+#endif -diff --git a/kernel/power/tuxonice_pagedir.c b/kernel/power/tuxonice_pagedir.c -new file mode 100644 -index 0000000..091c9e3 ---- /dev/null -+++ b/kernel/power/tuxonice_pagedir.c -@@ -0,0 +1,339 @@ -+/* -+ * kernel/power/tuxonice_pagedir.c -+ * -+ * Copyright (C) 1998-2001 Gabor Kuti -+ * Copyright (C) 1998,2001,2002 Pavel Machek -+ * Copyright (C) 2002-2003 Florent Chabaud -+ * Copyright (C) 2006-2010 Nigel Cunningham (nigel at tuxonice net) -+ * -+ * This file is released under the GPLv2. -+ * -+ * Routines for handling pagesets. -+ * Note that pbes aren't actually stored as such. They're stored as -+ * bitmaps and extents. -+ */ -+ -+#include -+#include -+#include -+#include -+#include -+#include -+#include -+ -+#include "tuxonice_pageflags.h" -+#include "tuxonice_ui.h" -+#include "tuxonice_pagedir.h" -+#include "tuxonice_prepare_image.h" -+#include "tuxonice.h" -+#include "tuxonice_builtin.h" -+#include "tuxonice_alloc.h" -+ -+static int ptoi_pfn; -+static struct pbe *this_low_pbe; -+static struct pbe **last_low_pbe_ptr; -+static struct memory_bitmap dup_map1, dup_map2; -+ -+void toi_reset_alt_image_pageset2_pfn(void) -+{ -+ memory_bm_position_reset(pageset2_map); -+} -+ -+static struct page *first_conflicting_page; -+ -+/* -+ * free_conflicting_pages -+ */ -+ -+static void free_conflicting_pages(void) -+{ -+ while (first_conflicting_page) { -+ struct page *next = -+ *((struct page **) kmap(first_conflicting_page)); -+ kunmap(first_conflicting_page); -+ toi__free_page(29, first_conflicting_page); -+ first_conflicting_page = next; -+ } -+} -+ -+/* __toi_get_nonconflicting_page -+ * -+ * Description: Gets order zero pages that won't be overwritten -+ * while copying the original pages. 
-+ */ -+ -+struct page *___toi_get_nonconflicting_page(int can_be_highmem) -+{ -+ struct page *page; -+ gfp_t flags = TOI_ATOMIC_GFP; -+ if (can_be_highmem) -+ flags |= __GFP_HIGHMEM; -+ -+ -+ if (test_toi_state(TOI_LOADING_ALT_IMAGE) && -+ pageset2_map && -+ (ptoi_pfn != BM_END_OF_MAP)) { -+ do { -+ ptoi_pfn = memory_bm_next_pfn(pageset2_map); -+ if (ptoi_pfn != BM_END_OF_MAP) { -+ page = pfn_to_page(ptoi_pfn); -+ if (!PagePageset1(page) && -+ (can_be_highmem || !PageHighMem(page))) -+ return page; -+ } -+ } while (ptoi_pfn != BM_END_OF_MAP); -+ } -+ -+ do { -+ page = toi_alloc_page(29, flags); -+ if (!page) { -+ printk(KERN_INFO "Failed to get nonconflicting " -+ "page.\n"); -+ return NULL; -+ } -+ if (PagePageset1(page)) { -+ struct page **next = (struct page **) kmap(page); -+ *next = first_conflicting_page; -+ first_conflicting_page = page; -+ kunmap(page); -+ } -+ } while (PagePageset1(page)); -+ -+ return page; -+} -+ -+unsigned long __toi_get_nonconflicting_page(void) -+{ -+ struct page *page = ___toi_get_nonconflicting_page(0); -+ return page ? (unsigned long) page_address(page) : 0; -+} -+ -+static struct pbe *get_next_pbe(struct page **page_ptr, struct pbe *this_pbe, -+ int highmem) -+{ -+ if (((((unsigned long) this_pbe) & (PAGE_SIZE - 1)) -+ + 2 * sizeof(struct pbe)) > PAGE_SIZE) { -+ struct page *new_page = -+ ___toi_get_nonconflicting_page(highmem); -+ if (!new_page) -+ return ERR_PTR(-ENOMEM); -+ this_pbe = (struct pbe *) kmap(new_page); -+ memset(this_pbe, 0, PAGE_SIZE); -+ *page_ptr = new_page; -+ } else -+ this_pbe++; -+ -+ return this_pbe; -+} -+ -+/** -+ * get_pageset1_load_addresses - generate pbes for conflicting pages -+ * -+ * We check here that pagedir & pages it points to won't collide -+ * with pages where we're going to restore from the loaded pages -+ * later. -+ * -+ * Returns: -+ * Zero on success, one if couldn't find enough pages (shouldn't -+ * happen). 
-+ **/ -+int toi_get_pageset1_load_addresses(void) -+{ -+ int pfn, highallocd = 0, lowallocd = 0; -+ int low_needed = pagedir1.size - get_highmem_size(pagedir1); -+ int high_needed = get_highmem_size(pagedir1); -+ int low_pages_for_highmem = 0; -+ gfp_t flags = GFP_ATOMIC | __GFP_NOWARN | __GFP_HIGHMEM; -+ struct page *page, *high_pbe_page = NULL, *last_high_pbe_page = NULL, -+ *low_pbe_page; -+ struct pbe **last_high_pbe_ptr = &restore_highmem_pblist, -+ *this_high_pbe = NULL; -+ int orig_low_pfn, orig_high_pfn; -+ int high_pbes_done = 0, low_pbes_done = 0; -+ int low_direct = 0, high_direct = 0, result = 0, i; -+ -+ /* -+ * We need to duplicate pageset1's map because memory_bm_next_pfn's -+ * state gets stomped on by the PagePageset1() test in setup_pbes. -+ */ -+ memory_bm_create(&dup_map1, GFP_ATOMIC, 0); -+ memory_bm_dup(pageset1_map, &dup_map1); -+ -+ memory_bm_create(&dup_map2, GFP_ATOMIC, 0); -+ memory_bm_dup(pageset1_map, &dup_map2); -+ -+ memory_bm_position_reset(pageset1_map); -+ memory_bm_position_reset(&dup_map1); -+ memory_bm_position_reset(&dup_map2); -+ -+ last_low_pbe_ptr = &restore_pblist; -+ -+ /* First, allocate pages for the start of our pbe lists. */ -+ if (high_needed) { -+ high_pbe_page = ___toi_get_nonconflicting_page(1); -+ if (!high_pbe_page) { -+ result = -ENOMEM; -+ goto out; -+ } -+ this_high_pbe = (struct pbe *) kmap(high_pbe_page); -+ memset(this_high_pbe, 0, PAGE_SIZE); -+ } -+ -+ low_pbe_page = ___toi_get_nonconflicting_page(0); -+ if (!low_pbe_page) { -+ result = -ENOMEM; -+ goto out; -+ } -+ this_low_pbe = (struct pbe *) page_address(low_pbe_page); -+ -+ /* -+ * Next, allocate the number of pages we need. 
-+ */ -+ -+ i = low_needed + high_needed; -+ -+ do { -+ int is_high; -+ -+ if (i == low_needed) -+ flags &= ~__GFP_HIGHMEM; -+ -+ page = toi_alloc_page(30, flags); -+ BUG_ON(!page); -+ -+ SetPagePageset1Copy(page); -+ is_high = PageHighMem(page); -+ -+ if (PagePageset1(page)) { -+ if (is_high) -+ high_direct++; -+ else -+ low_direct++; -+ } else { -+ if (is_high) -+ highallocd++; -+ else -+ lowallocd++; -+ } -+ } while (--i); -+ -+ high_needed -= high_direct; -+ low_needed -= low_direct; -+ -+ /* -+ * Do we need to use some lowmem pages for the copies of highmem -+ * pages? -+ */ -+ if (high_needed > highallocd) { -+ low_pages_for_highmem = high_needed - highallocd; -+ high_needed -= low_pages_for_highmem; -+ low_needed += low_pages_for_highmem; -+ } -+ -+ /* -+ * Now generate our pbes (which will be used for the atomic restore), -+ * and free unneeded pages. -+ */ -+ memory_bm_position_reset(pageset1_copy_map); -+ for (pfn = memory_bm_next_pfn(pageset1_copy_map); pfn != BM_END_OF_MAP; -+ pfn = memory_bm_next_pfn(pageset1_copy_map)) { -+ int is_high; -+ page = pfn_to_page(pfn); -+ is_high = PageHighMem(page); -+ -+ if (PagePageset1(page)) -+ continue; -+ -+ /* Nope. We're going to use this page. Add a pbe. 
*/ -+ if (is_high || low_pages_for_highmem) { -+ struct page *orig_page; -+ high_pbes_done++; -+ if (!is_high) -+ low_pages_for_highmem--; -+ do { -+ orig_high_pfn = memory_bm_next_pfn(&dup_map1); -+ BUG_ON(orig_high_pfn == BM_END_OF_MAP); -+ orig_page = pfn_to_page(orig_high_pfn); -+ } while (!PageHighMem(orig_page) || -+ PagePageset1Copy(orig_page)); -+ -+ this_high_pbe->orig_address = orig_page; -+ this_high_pbe->address = page; -+ this_high_pbe->next = NULL; -+ if (last_high_pbe_page != high_pbe_page) { -+ *last_high_pbe_ptr = -+ (struct pbe *) high_pbe_page; -+ if (!last_high_pbe_page) -+ last_high_pbe_page = high_pbe_page; -+ } else -+ *last_high_pbe_ptr = this_high_pbe; -+ last_high_pbe_ptr = &this_high_pbe->next; -+ if (last_high_pbe_page != high_pbe_page) { -+ kunmap(last_high_pbe_page); -+ last_high_pbe_page = high_pbe_page; -+ } -+ this_high_pbe = get_next_pbe(&high_pbe_page, -+ this_high_pbe, 1); -+ if (IS_ERR(this_high_pbe)) { -+ printk(KERN_INFO -+ "This high pbe is an error.\n"); -+ return -ENOMEM; -+ } -+ } else { -+ struct page *orig_page; -+ low_pbes_done++; -+ do { -+ orig_low_pfn = memory_bm_next_pfn(&dup_map2); -+ BUG_ON(orig_low_pfn == BM_END_OF_MAP); -+ orig_page = pfn_to_page(orig_low_pfn); -+ } while (PageHighMem(orig_page) || -+ PagePageset1Copy(orig_page)); -+ -+ this_low_pbe->orig_address = page_address(orig_page); -+ this_low_pbe->address = page_address(page); -+ this_low_pbe->next = NULL; -+ *last_low_pbe_ptr = this_low_pbe; -+ last_low_pbe_ptr = &this_low_pbe->next; -+ this_low_pbe = get_next_pbe(&low_pbe_page, -+ this_low_pbe, 0); -+ if (IS_ERR(this_low_pbe)) { -+ printk(KERN_INFO "this_low_pbe is an error.\n"); -+ return -ENOMEM; -+ } -+ } -+ } -+ -+ if (high_pbe_page) -+ kunmap(high_pbe_page); -+ -+ if (last_high_pbe_page != high_pbe_page) { -+ if (last_high_pbe_page) -+ kunmap(last_high_pbe_page); -+ toi__free_page(29, high_pbe_page); -+ } -+ -+ free_conflicting_pages(); -+ -+out: -+ memory_bm_free(&dup_map1, 0); -+ 
memory_bm_free(&dup_map2, 0); -+ -+ return result; -+} -+ -+int add_boot_kernel_data_pbe(void) -+{ -+ this_low_pbe->address = (char *) __toi_get_nonconflicting_page(); -+ if (!this_low_pbe->address) { -+ printk(KERN_INFO "Failed to get bkd atomic restore buffer."); -+ return -ENOMEM; -+ } -+ -+ toi_bkd.size = sizeof(toi_bkd); -+ memcpy(this_low_pbe->address, &toi_bkd, sizeof(toi_bkd)); -+ -+ *last_low_pbe_ptr = this_low_pbe; -+ this_low_pbe->orig_address = (char *) boot_kernel_data_buffer; -+ this_low_pbe->next = NULL; -+ return 0; -+} -diff --git a/kernel/power/tuxonice_pagedir.h b/kernel/power/tuxonice_pagedir.h -new file mode 100644 -index 0000000..d08e4b1 ---- /dev/null -+++ b/kernel/power/tuxonice_pagedir.h -@@ -0,0 +1,50 @@ -+/* -+ * kernel/power/tuxonice_pagedir.h -+ * -+ * Copyright (C) 2006-2010 Nigel Cunningham (nigel at tuxonice net) -+ * -+ * This file is released under the GPLv2. -+ * -+ * Declarations for routines for handling pagesets. -+ */ -+ -+#ifndef KERNEL_POWER_PAGEDIR_H -+#define KERNEL_POWER_PAGEDIR_H -+ -+/* Pagedir -+ * -+ * Contains the metadata for a set of pages saved in the image. 
-+ */ -+ -+struct pagedir { -+ int id; -+ unsigned long size; -+#ifdef CONFIG_HIGHMEM -+ unsigned long size_high; -+#endif -+}; -+ -+#ifdef CONFIG_HIGHMEM -+#define get_highmem_size(pagedir) (pagedir.size_high) -+#define set_highmem_size(pagedir, sz) do { pagedir.size_high = sz; } while (0) -+#define inc_highmem_size(pagedir) do { pagedir.size_high++; } while (0) -+#define get_lowmem_size(pagedir) (pagedir.size - pagedir.size_high) -+#else -+#define get_highmem_size(pagedir) (0) -+#define set_highmem_size(pagedir, sz) do { } while (0) -+#define inc_highmem_size(pagedir) do { } while (0) -+#define get_lowmem_size(pagedir) (pagedir.size) -+#endif -+ -+extern struct pagedir pagedir1, pagedir2; -+ -+extern void toi_copy_pageset1(void); -+ -+extern int toi_get_pageset1_load_addresses(void); -+ -+extern unsigned long __toi_get_nonconflicting_page(void); -+struct page *___toi_get_nonconflicting_page(int can_be_highmem); -+ -+extern void toi_reset_alt_image_pageset2_pfn(void); -+extern int add_boot_kernel_data_pbe(void); -+#endif -diff --git a/kernel/power/tuxonice_pageflags.c b/kernel/power/tuxonice_pageflags.c -new file mode 100644 -index 0000000..e9ec5b5 ---- /dev/null -+++ b/kernel/power/tuxonice_pageflags.c -@@ -0,0 +1,28 @@ -+/* -+ * kernel/power/tuxonice_pageflags.c -+ * -+ * Copyright (C) 2004-2010 Nigel Cunningham (nigel at tuxonice net) -+ * -+ * This file is released under the GPLv2. -+ * -+ * Routines for serialising and relocating pageflags in which we -+ * store our image metadata. 
-+ */ -+ -+#include -+#include "tuxonice_pageflags.h" -+#include "power.h" -+ -+int toi_pageflags_space_needed(void) -+{ -+ int total = 0; -+ struct bm_block *bb; -+ -+ total = sizeof(unsigned int); -+ -+ list_for_each_entry(bb, &pageset1_map->blocks, hook) -+ total += 2 * sizeof(unsigned long) + PAGE_SIZE; -+ -+ return total; -+} -+EXPORT_SYMBOL_GPL(toi_pageflags_space_needed); -diff --git a/kernel/power/tuxonice_pageflags.h b/kernel/power/tuxonice_pageflags.h -new file mode 100644 -index 0000000..d5aa7b1 ---- /dev/null -+++ b/kernel/power/tuxonice_pageflags.h -@@ -0,0 +1,72 @@ -+/* -+ * kernel/power/tuxonice_pageflags.h -+ * -+ * Copyright (C) 2004-2010 Nigel Cunningham (nigel at tuxonice net) -+ * -+ * This file is released under the GPLv2. -+ */ -+ -+#ifndef KERNEL_POWER_TUXONICE_PAGEFLAGS_H -+#define KERNEL_POWER_TUXONICE_PAGEFLAGS_H -+ -+extern struct memory_bitmap *pageset1_map; -+extern struct memory_bitmap *pageset1_copy_map; -+extern struct memory_bitmap *pageset2_map; -+extern struct memory_bitmap *page_resave_map; -+extern struct memory_bitmap *io_map; -+extern struct memory_bitmap *nosave_map; -+extern struct memory_bitmap *free_map; -+ -+#define PagePageset1(page) \ -+ (memory_bm_test_bit(pageset1_map, page_to_pfn(page))) -+#define SetPagePageset1(page) \ -+ (memory_bm_set_bit(pageset1_map, page_to_pfn(page))) -+#define ClearPagePageset1(page) \ -+ (memory_bm_clear_bit(pageset1_map, page_to_pfn(page))) -+ -+#define PagePageset1Copy(page) \ -+ (memory_bm_test_bit(pageset1_copy_map, page_to_pfn(page))) -+#define SetPagePageset1Copy(page) \ -+ (memory_bm_set_bit(pageset1_copy_map, page_to_pfn(page))) -+#define ClearPagePageset1Copy(page) \ -+ (memory_bm_clear_bit(pageset1_copy_map, page_to_pfn(page))) -+ -+#define PagePageset2(page) \ -+ (memory_bm_test_bit(pageset2_map, page_to_pfn(page))) -+#define SetPagePageset2(page) \ -+ (memory_bm_set_bit(pageset2_map, page_to_pfn(page))) -+#define ClearPagePageset2(page) \ -+ (memory_bm_clear_bit(pageset2_map, 
page_to_pfn(page))) -+ -+#define PageWasRW(page) \ -+ (memory_bm_test_bit(pageset2_map, page_to_pfn(page))) -+#define SetPageWasRW(page) \ -+ (memory_bm_set_bit(pageset2_map, page_to_pfn(page))) -+#define ClearPageWasRW(page) \ -+ (memory_bm_clear_bit(pageset2_map, page_to_pfn(page))) -+ -+#define PageResave(page) (page_resave_map ? \ -+ memory_bm_test_bit(page_resave_map, page_to_pfn(page)) : 0) -+#define SetPageResave(page) \ -+ (memory_bm_set_bit(page_resave_map, page_to_pfn(page))) -+#define ClearPageResave(page) \ -+ (memory_bm_clear_bit(page_resave_map, page_to_pfn(page))) -+ -+#define PageNosave(page) (nosave_map ? \ -+ memory_bm_test_bit(nosave_map, page_to_pfn(page)) : 0) -+#define SetPageNosave(page) \ -+ (memory_bm_set_bit(nosave_map, page_to_pfn(page))) -+#define ClearPageNosave(page) \ -+ (memory_bm_clear_bit(nosave_map, page_to_pfn(page))) -+ -+#define PageNosaveFree(page) (free_map ? \ -+ memory_bm_test_bit(free_map, page_to_pfn(page)) : 0) -+#define SetPageNosaveFree(page) \ -+ (memory_bm_set_bit(free_map, page_to_pfn(page))) -+#define ClearPageNosaveFree(page) \ -+ (memory_bm_clear_bit(free_map, page_to_pfn(page))) -+ -+extern void save_pageflags(struct memory_bitmap *pagemap); -+extern int load_pageflags(struct memory_bitmap *pagemap); -+extern int toi_pageflags_space_needed(void); -+#endif -diff --git a/kernel/power/tuxonice_power_off.c b/kernel/power/tuxonice_power_off.c -new file mode 100644 -index 0000000..07e39c0 ---- /dev/null -+++ b/kernel/power/tuxonice_power_off.c -@@ -0,0 +1,285 @@ -+/* -+ * kernel/power/tuxonice_power_off.c -+ * -+ * Copyright (C) 2006-2010 Nigel Cunningham (nigel at tuxonice net) -+ * -+ * This file is released under the GPLv2. -+ * -+ * Support for powering down. 
-+ */ -+ -+#include -+#include -+#include -+#include -+#include -+#include -+#include -+#include -+#include "tuxonice.h" -+#include "tuxonice_ui.h" -+#include "tuxonice_power_off.h" -+#include "tuxonice_sysfs.h" -+#include "tuxonice_modules.h" -+#include "tuxonice_io.h" -+ -+unsigned long toi_poweroff_method; /* 0 - Kernel power off */ -+EXPORT_SYMBOL_GPL(toi_poweroff_method); -+ -+static int wake_delay; -+static char lid_state_file[256], wake_alarm_dir[256]; -+static struct file *lid_file, *alarm_file, *epoch_file; -+static int post_wake_state = -1; -+ -+static int did_suspend_to_both; -+ -+/* -+ * __toi_power_down -+ * Functionality : Powers down or reboots the computer once the image -+ * has been written to disk. -+ * Key Assumptions : Able to reboot/power down via code called or that -+ * the warning emitted if the calls fail will be visible -+ * to the user (ie printk resumes devices). -+ */ -+ -+static void __toi_power_down(int method) -+{ -+ int error; -+ -+ toi_cond_pause(1, test_action_state(TOI_REBOOT) ? "Ready to reboot." : -+ "Powering down."); -+ -+ if (test_result_state(TOI_ABORTED)) -+ goto out; -+ -+ if (test_action_state(TOI_REBOOT)) -+ kernel_restart(NULL); -+ -+ switch (method) { -+ case 0: -+ break; -+ case 3: -+ /* -+ * Re-read the overwritten part of pageset2 to make post-resume -+ * faster. -+ */ -+ if (read_pageset2(1)) -+ panic("Attempt to reload pagedir 2 failed. " -+ "Try rebooting."); -+ -+ pm_prepare_console(); -+ -+ error = pm_notifier_call_chain(PM_SUSPEND_PREPARE); -+ if (!error) { -+ error = suspend_devices_and_enter(PM_SUSPEND_MEM); -+ if (!error) -+ did_suspend_to_both = 1; -+ } -+ pm_notifier_call_chain(PM_POST_SUSPEND); -+ pm_restore_console(); -+ -+ /* Success - we're now post-resume-from-ram */ -+ if (did_suspend_to_both) -+ return; -+ -+ /* Failed to suspend to ram - do normal power off */ -+ break; -+ case 4: -+ /* -+ * If succeeds, doesn't return. If fails, do a simple -+ * powerdown. 
-+ */ -+ hibernation_platform_enter(); -+ break; -+ case 5: -+ /* Historic entry only now */ -+ break; -+ } -+ -+ if (method && method != 5) -+ toi_cond_pause(1, -+ "Falling back to alternate power off method."); -+ -+ if (test_result_state(TOI_ABORTED)) -+ goto out; -+ -+ kernel_power_off(); -+ kernel_halt(); -+ toi_cond_pause(1, "Powerdown failed."); -+ while (1) -+ cpu_relax(); -+ -+out: -+ if (read_pageset2(1)) -+ panic("Attempt to reload pagedir 2 failed. Try rebooting."); -+ return; -+} -+ -+#define CLOSE_FILE(file) \ -+ if (file) { \ -+ filp_close(file, NULL); file = NULL; \ -+ } -+ -+static void powerdown_cleanup(int toi_or_resume) -+{ -+ if (!toi_or_resume) -+ return; -+ -+ CLOSE_FILE(lid_file); -+ CLOSE_FILE(alarm_file); -+ CLOSE_FILE(epoch_file); -+} -+ -+static void open_file(char *format, char *arg, struct file **var, int mode, -+ char *desc) -+{ -+ char buf[256]; -+ -+ if (strlen(arg)) { -+ sprintf(buf, format, arg); -+ *var = filp_open(buf, mode, 0); -+ if (IS_ERR(*var) || !*var) { -+ printk(KERN_INFO "Failed to open %s file '%s' (%p).\n", -+ desc, buf, *var); -+ *var = NULL; -+ } -+ } -+} -+ -+static int powerdown_init(int toi_or_resume) -+{ -+ if (!toi_or_resume) -+ return 0; -+ -+ did_suspend_to_both = 0; -+ -+ open_file("/proc/acpi/button/%s/state", lid_state_file, &lid_file, -+ O_RDONLY, "lid"); -+ -+ if (strlen(wake_alarm_dir)) { -+ open_file("/sys/class/rtc/%s/wakealarm", wake_alarm_dir, -+ &alarm_file, O_WRONLY, "alarm"); -+ -+ open_file("/sys/class/rtc/%s/since_epoch", wake_alarm_dir, -+ &epoch_file, O_RDONLY, "epoch"); -+ } -+ -+ return 0; -+} -+ -+static int lid_closed(void) -+{ -+ char array[25]; -+ ssize_t size; -+ loff_t pos = 0; -+ -+ if (!lid_file) -+ return 0; -+ -+ size = vfs_read(lid_file, (char __user *) array, 25, &pos); -+ if ((int) size < 1) { -+ printk(KERN_INFO "Failed to read lid state file (%d).\n", -+ (int) size); -+ return 0; -+ } -+ -+ if (!strcmp(array, "state: closed\n")) -+ return 1; -+ -+ return 0; -+} -+ -+static 
void write_alarm_file(int value) -+{ -+ ssize_t size; -+ char buf[40]; -+ loff_t pos = 0; -+ -+ if (!alarm_file) -+ return; -+ -+ sprintf(buf, "%d\n", value); -+ -+ size = vfs_write(alarm_file, (char __user *)buf, strlen(buf), &pos); -+ -+ if (size < 0) -+ printk(KERN_INFO "Error %d writing alarm value %s.\n", -+ (int) size, buf); -+} -+ -+/** -+ * toi_check_resleep: See whether to powerdown again after waking. -+ * -+ * After waking, check whether we should powerdown again in a (usually -+ * different) way. We only do this if the lid switch is still closed. -+ */ -+void toi_check_resleep(void) -+{ -+ /* We only return if we suspended to ram and woke. */ -+ if (lid_closed() && post_wake_state >= 0) -+ __toi_power_down(post_wake_state); -+} -+ -+void toi_power_down(void) -+{ -+ if (alarm_file && wake_delay) { -+ char array[25]; -+ loff_t pos = 0; -+ size_t size = vfs_read(epoch_file, (char __user *) array, 25, -+ &pos); -+ -+ if (((int) size) < 1) -+ printk(KERN_INFO "Failed to read epoch file (%d).\n", -+ (int) size); -+ else { -+ unsigned long since_epoch; -+ if (!strict_strtoul(array, 0, &since_epoch)) { -+ /* Clear any wakeup time. */ -+ write_alarm_file(0); -+ -+ /* Set new wakeup time. 
*/ -+ write_alarm_file(since_epoch + wake_delay); -+ } -+ } -+ } -+ -+ __toi_power_down(toi_poweroff_method); -+ -+ toi_check_resleep(); -+} -+EXPORT_SYMBOL_GPL(toi_power_down); -+ -+static struct toi_sysfs_data sysfs_params[] = { -+#if defined(CONFIG_ACPI) -+ SYSFS_STRING("lid_file", SYSFS_RW, lid_state_file, 256, 0, NULL), -+ SYSFS_INT("wake_delay", SYSFS_RW, &wake_delay, 0, INT_MAX, 0, NULL), -+ SYSFS_STRING("wake_alarm_dir", SYSFS_RW, wake_alarm_dir, 256, 0, NULL), -+ SYSFS_INT("post_wake_state", SYSFS_RW, &post_wake_state, -1, 5, 0, -+ NULL), -+ SYSFS_UL("powerdown_method", SYSFS_RW, &toi_poweroff_method, 0, 5, 0), -+ SYSFS_INT("did_suspend_to_both", SYSFS_READONLY, &did_suspend_to_both, -+ 0, 0, 0, NULL) -+#endif -+}; -+ -+static struct toi_module_ops powerdown_ops = { -+ .type = MISC_HIDDEN_MODULE, -+ .name = "poweroff", -+ .initialise = powerdown_init, -+ .cleanup = powerdown_cleanup, -+ .directory = "[ROOT]", -+ .module = THIS_MODULE, -+ .sysfs_data = sysfs_params, -+ .num_sysfs_entries = sizeof(sysfs_params) / -+ sizeof(struct toi_sysfs_data), -+}; -+ -+int toi_poweroff_init(void) -+{ -+ return toi_register_module(&powerdown_ops); -+} -+ -+void toi_poweroff_exit(void) -+{ -+ toi_unregister_module(&powerdown_ops); -+} -diff --git a/kernel/power/tuxonice_power_off.h b/kernel/power/tuxonice_power_off.h -new file mode 100644 -index 0000000..9aa0ea8 ---- /dev/null -+++ b/kernel/power/tuxonice_power_off.h -@@ -0,0 +1,24 @@ -+/* -+ * kernel/power/tuxonice_power_off.h -+ * -+ * Copyright (C) 2006-2010 Nigel Cunningham (nigel at tuxonice net) -+ * -+ * This file is released under the GPLv2. -+ * -+ * Support for the powering down. 
-+ */ -+ -+int toi_pm_state_finish(void); -+void toi_power_down(void); -+extern unsigned long toi_poweroff_method; -+int toi_poweroff_init(void); -+void toi_poweroff_exit(void); -+void toi_check_resleep(void); -+ -+extern int platform_begin(int platform_mode); -+extern int platform_pre_snapshot(int platform_mode); -+extern void platform_leave(int platform_mode); -+extern void platform_end(int platform_mode); -+extern void platform_finish(int platform_mode); -+extern int platform_pre_restore(int platform_mode); -+extern void platform_restore_cleanup(int platform_mode); -diff --git a/kernel/power/tuxonice_prepare_image.c b/kernel/power/tuxonice_prepare_image.c -new file mode 100644 -index 0000000..e58225e ---- /dev/null -+++ b/kernel/power/tuxonice_prepare_image.c -@@ -0,0 +1,1093 @@ -+/* -+ * kernel/power/tuxonice_prepare_image.c -+ * -+ * Copyright (C) 2003-2010 Nigel Cunningham (nigel at tuxonice net) -+ * -+ * This file is released under the GPLv2. -+ * -+ * We need to eat memory until we can: -+ * 1. Perform the save without changing anything (RAM_NEEDED < #pages) -+ * 2. Fit it all in available space (toiActiveAllocator->available_space() >= -+ * main_storage_needed()) -+ * 3. Reload the pagedir and pageset1 to places that don't collide with their -+ * final destinations, not knowing to what extent the resumed kernel will -+ * overlap with the one loaded at boot time. I think the resumed kernel -+ * should overlap completely, but I don't want to rely on this as it is -+ * an unproven assumption. We therefore assume there will be no overlap at -+ * all (worse case). -+ * 4. Meet the user's requested limit (if any) on the size of the image. -+ * The limit is in MB, so pages/256 (assuming 4K pages). 
-+ * -+ */ -+ -+#include -+#include -+#include -+#include -+#include -+ -+#include "tuxonice_pageflags.h" -+#include "tuxonice_modules.h" -+#include "tuxonice_io.h" -+#include "tuxonice_ui.h" -+#include "tuxonice_prepare_image.h" -+#include "tuxonice.h" -+#include "tuxonice_extent.h" -+#include "tuxonice_checksum.h" -+#include "tuxonice_sysfs.h" -+#include "tuxonice_alloc.h" -+#include "tuxonice_atomic_copy.h" -+ -+static unsigned long num_nosave, main_storage_allocated, storage_limit, -+ header_storage_needed; -+unsigned long extra_pd1_pages_allowance = -+ CONFIG_TOI_DEFAULT_EXTRA_PAGES_ALLOWANCE; -+long image_size_limit; -+static int no_ps2_needed; -+ -+struct attention_list { -+ struct task_struct *task; -+ struct attention_list *next; -+}; -+ -+static struct attention_list *attention_list; -+ -+#define PAGESET1 0 -+#define PAGESET2 1 -+ -+void free_attention_list(void) -+{ -+ struct attention_list *last = NULL; -+ -+ while (attention_list) { -+ last = attention_list; -+ attention_list = attention_list->next; -+ toi_kfree(6, last, sizeof(*last)); -+ } -+} -+ -+static int build_attention_list(void) -+{ -+ int i, task_count = 0; -+ struct task_struct *p; -+ struct attention_list *next; -+ -+ /* -+ * Count all userspace process (with task->mm) marked PF_NOFREEZE. -+ */ -+ read_lock(&tasklist_lock); -+ for_each_process(p) -+ if ((p->flags & PF_NOFREEZE) || p == current) -+ task_count++; -+ read_unlock(&tasklist_lock); -+ -+ /* -+ * Allocate attention list structs. 
-+ */ -+ for (i = 0; i < task_count; i++) { -+ struct attention_list *this = -+ toi_kzalloc(6, sizeof(struct attention_list), -+ TOI_WAIT_GFP); -+ if (!this) { -+ printk(KERN_INFO "Failed to allocate slab for " -+ "attention list.\n"); -+ free_attention_list(); -+ return 1; -+ } -+ this->next = NULL; -+ if (attention_list) -+ this->next = attention_list; -+ attention_list = this; -+ } -+ -+ next = attention_list; -+ read_lock(&tasklist_lock); -+ for_each_process(p) -+ if ((p->flags & PF_NOFREEZE) || p == current) { -+ next->task = p; -+ next = next->next; -+ } -+ read_unlock(&tasklist_lock); -+ return 0; -+} -+ -+static void pageset2_full(void) -+{ -+ struct zone *zone; -+ struct page *page; -+ unsigned long flags; -+ int i; -+ -+ for_each_populated_zone(zone) { -+ spin_lock_irqsave(&zone->lru_lock, flags); -+ for_each_lru(i) { -+ if (!zone_page_state(zone, NR_LRU_BASE + i)) -+ continue; -+ -+ list_for_each_entry(page, &zone->lru[i].list, lru) { -+ struct address_space *mapping; -+ -+ mapping = page_mapping(page); -+ if (!mapping || !mapping->host || -+ !(mapping->host->i_flags & S_ATOMIC_COPY)) -+ SetPagePageset2(page); -+ } -+ } -+ spin_unlock_irqrestore(&zone->lru_lock, flags); -+ } -+} -+ -+/* -+ * toi_mark_task_as_pageset -+ * Functionality : Marks all the saveable pages belonging to a given process -+ * as belonging to a particular pageset. 
-+ */ -+ -+static void toi_mark_task_as_pageset(struct task_struct *t, int pageset2) -+{ -+ struct vm_area_struct *vma; -+ struct mm_struct *mm; -+ -+ mm = t->active_mm; -+ -+ if (!mm || !mm->mmap) -+ return; -+ -+ if (!irqs_disabled()) -+ down_read(&mm->mmap_sem); -+ -+ for (vma = mm->mmap; vma; vma = vma->vm_next) { -+ unsigned long posn; -+ -+ if (!vma->vm_start || -+ vma->vm_flags & (VM_IO | VM_RESERVED | VM_PFNMAP)) -+ continue; -+ -+ for (posn = vma->vm_start; posn < vma->vm_end; -+ posn += PAGE_SIZE) { -+ struct page *page = follow_page(vma, posn, 0); -+ struct address_space *mapping; -+ -+ if (!page || !pfn_valid(page_to_pfn(page))) -+ continue; -+ -+ mapping = page_mapping(page); -+ if (mapping && mapping->host && -+ mapping->host->i_flags & S_ATOMIC_COPY) -+ continue; -+ -+ if (pageset2) -+ SetPagePageset2(page); -+ else { -+ ClearPagePageset2(page); -+ SetPagePageset1(page); -+ } -+ } -+ } -+ -+ if (!irqs_disabled()) -+ up_read(&mm->mmap_sem); -+} -+ -+static void mark_tasks(int pageset) -+{ -+ struct task_struct *p; -+ -+ read_lock(&tasklist_lock); -+ for_each_process(p) { -+ if (!p->mm) -+ continue; -+ -+ if (p->flags & PF_KTHREAD) -+ continue; -+ -+ toi_mark_task_as_pageset(p, pageset); -+ } -+ read_unlock(&tasklist_lock); -+ -+} -+ -+/* mark_pages_for_pageset2 -+ * -+ * Description: Mark unshared pages in processes not needed for hibernate as -+ * being able to be written out in a separate pagedir. -+ * HighMem pages are simply marked as pageset2. They won't be -+ * needed during hibernate. -+ */ -+ -+static void toi_mark_pages_for_pageset2(void) -+{ -+ struct attention_list *this = attention_list; -+ -+ memory_bm_clear(pageset2_map); -+ -+ if (test_action_state(TOI_NO_PAGESET2) || no_ps2_needed) -+ return; -+ -+ if (test_action_state(TOI_PAGESET2_FULL)) -+ pageset2_full(); -+ else -+ mark_tasks(PAGESET2); -+ -+ /* -+ * Because the tasks in attention_list are ones related to hibernating, -+ * we know that they won't go away under us. 
-+ */ -+ -+ while (this) { -+ if (!test_result_state(TOI_ABORTED)) -+ toi_mark_task_as_pageset(this->task, PAGESET1); -+ this = this->next; -+ } -+} -+ -+/* -+ * The atomic copy of pageset1 is stored in pageset2 pages. -+ * But if pageset1 is larger (normally only just after boot), -+ * we need to allocate extra pages to store the atomic copy. -+ * The following data struct and functions are used to handle -+ * the allocation and freeing of that memory. -+ */ -+ -+static unsigned long extra_pages_allocated; -+ -+struct extras { -+ struct page *page; -+ int order; -+ struct extras *next; -+}; -+ -+static struct extras *extras_list; -+ -+/* toi_free_extra_pagedir_memory -+ * -+ * Description: Free previously allocated extra pagedir memory. -+ */ -+void toi_free_extra_pagedir_memory(void) -+{ -+ /* Free allocated pages */ -+ while (extras_list) { -+ struct extras *this = extras_list; -+ int i; -+ -+ extras_list = this->next; -+ -+ for (i = 0; i < (1 << this->order); i++) -+ ClearPageNosave(this->page + i); -+ -+ toi_free_pages(9, this->page, this->order); -+ toi_kfree(7, this, sizeof(*this)); -+ } -+ -+ extra_pages_allocated = 0; -+} -+ -+/* toi_allocate_extra_pagedir_memory -+ * -+ * Description: Allocate memory for making the atomic copy of pagedir1 in the -+ * case where it is bigger than pagedir2. -+ * Arguments: int num_to_alloc: Number of extra pages needed. -+ * Result: int. Number of extra pages we now have allocated. 
-+ */ -+static int toi_allocate_extra_pagedir_memory(int extra_pages_needed) -+{ -+ int j, order, num_to_alloc = extra_pages_needed - extra_pages_allocated; -+ gfp_t flags = TOI_ATOMIC_GFP; -+ -+ if (num_to_alloc < 1) -+ return 0; -+ -+ order = fls(num_to_alloc); -+ if (order >= MAX_ORDER) -+ order = MAX_ORDER - 1; -+ -+ while (num_to_alloc) { -+ struct page *newpage; -+ unsigned long virt; -+ struct extras *extras_entry; -+ -+ while ((1 << order) > num_to_alloc) -+ order--; -+ -+ extras_entry = (struct extras *) toi_kzalloc(7, -+ sizeof(struct extras), TOI_ATOMIC_GFP); -+ -+ if (!extras_entry) -+ return extra_pages_allocated; -+ -+ virt = toi_get_free_pages(9, flags, order); -+ while (!virt && order) { -+ order--; -+ virt = toi_get_free_pages(9, flags, order); -+ } -+ -+ if (!virt) { -+ toi_kfree(7, extras_entry, sizeof(*extras_entry)); -+ return extra_pages_allocated; -+ } -+ -+ newpage = virt_to_page(virt); -+ -+ extras_entry->page = newpage; -+ extras_entry->order = order; -+ extras_entry->next = NULL; -+ -+ if (extras_list) -+ extras_entry->next = extras_list; -+ -+ extras_list = extras_entry; -+ -+ for (j = 0; j < (1 << order); j++) { -+ SetPageNosave(newpage + j); -+ SetPagePageset1Copy(newpage + j); -+ } -+ -+ extra_pages_allocated += (1 << order); -+ num_to_alloc -= (1 << order); -+ } -+ -+ return extra_pages_allocated; -+} -+ -+/* -+ * real_nr_free_pages: Count pcp pages for a zone type or all zones -+ * (-1 for all, otherwise zone_idx() result desired). 
-+ */ -+unsigned long real_nr_free_pages(unsigned long zone_idx_mask) -+{ -+ struct zone *zone; -+ int result = 0, cpu; -+ -+ /* PCP lists */ -+ for_each_populated_zone(zone) { -+ if (!(zone_idx_mask & (1 << zone_idx(zone)))) -+ continue; -+ -+ for_each_online_cpu(cpu) { -+ struct per_cpu_pageset *pset = zone_pcp(zone, cpu); -+ struct per_cpu_pages *pcp = &pset->pcp; -+ result += pcp->count; -+ } -+ -+ result += zone_page_state(zone, NR_FREE_PAGES); -+ } -+ return result; -+} -+EXPORT_SYMBOL_GPL(real_nr_free_pages); -+ -+/* -+ * Discover how much extra memory will be required by the drivers -+ * when they're asked to hibernate. We can then ensure that amount -+ * of memory is available when we really want it. -+ */ -+static void get_extra_pd1_allowance(void) -+{ -+ unsigned long orig_num_free = real_nr_free_pages(all_zones_mask), final; -+ -+ toi_prepare_status(CLEAR_BAR, "Finding allowance for drivers."); -+ -+ if (toi_go_atomic(PMSG_FREEZE, 1)) -+ return; -+ -+ final = real_nr_free_pages(all_zones_mask); -+ toi_end_atomic(ATOMIC_ALL_STEPS, 1, 0); -+ -+ extra_pd1_pages_allowance = (orig_num_free > final) ? -+ orig_num_free - final + MIN_EXTRA_PAGES_ALLOWANCE : -+ MIN_EXTRA_PAGES_ALLOWANCE; -+} -+ -+/* -+ * Amount of storage needed, possibly taking into account the -+ * expected compression ratio and possibly also ignoring our -+ * allowance for extra pages. -+ */ -+static unsigned long main_storage_needed(int use_ecr, -+ int ignore_extra_pd1_allow) -+{ -+ return (pagedir1.size + pagedir2.size + -+ (ignore_extra_pd1_allow ? 0 : extra_pd1_pages_allowance)) * -+ (use_ecr ? toi_expected_compression_ratio() : 100) / 100; -+} -+ -+/* -+ * Storage needed for the image header, in bytes until the return. 
-+ */ -+unsigned long get_header_storage_needed(void) -+{ -+ unsigned long bytes = sizeof(struct toi_header) + -+ toi_header_storage_for_modules() + -+ toi_pageflags_space_needed() + -+ fs_info_space_needed(); -+ -+ return DIV_ROUND_UP(bytes, PAGE_SIZE); -+} -+EXPORT_SYMBOL_GPL(get_header_storage_needed); -+ -+/* -+ * When freeing memory, pages from either pageset might be freed. -+ * -+ * When seeking to free memory to be able to hibernate, for every ps1 page -+ * freed, we need 2 less pages for the atomic copy because there is one less -+ * page to copy and one more page into which data can be copied. -+ * -+ * Freeing ps2 pages saves us nothing directly. No more memory is available -+ * for the atomic copy. Indirectly, a ps1 page might be freed (slab?), but -+ * that's too much work to figure out. -+ * -+ * => ps1_to_free functions -+ * -+ * Of course if we just want to reduce the image size, because of storage -+ * limitations or an image size limit either ps will do. -+ * -+ * => any_to_free function -+ */ -+ -+static unsigned long lowpages_usable_for_highmem_copy(void) -+{ -+ unsigned long needed = get_lowmem_size(pagedir1) + -+ extra_pd1_pages_allowance + MIN_FREE_RAM + -+ toi_memory_for_modules(0), -+ available = get_lowmem_size(pagedir2) + -+ real_nr_free_low_pages() + extra_pages_allocated; -+ -+ return available > needed ? available - needed : 0; -+} -+ -+static unsigned long highpages_ps1_to_free(void) -+{ -+ unsigned long need = get_highmem_size(pagedir1), -+ available = get_highmem_size(pagedir2) + -+ real_nr_free_high_pages() + -+ lowpages_usable_for_highmem_copy(); -+ -+ return need > available ? DIV_ROUND_UP(need - available, 2) : 0; -+} -+ -+static unsigned long lowpages_ps1_to_free(void) -+{ -+ unsigned long needed = get_lowmem_size(pagedir1) + -+ extra_pd1_pages_allowance + MIN_FREE_RAM + -+ toi_memory_for_modules(0), -+ available = get_lowmem_size(pagedir2) + -+ real_nr_free_low_pages() + extra_pages_allocated; -+ -+ return needed > available ? 
DIV_ROUND_UP(needed - available, 2) : 0; -+} -+ -+static unsigned long current_image_size(void) -+{ -+ return pagedir1.size + pagedir2.size + header_storage_needed; -+} -+ -+static unsigned long storage_still_required(void) -+{ -+ unsigned long needed = main_storage_needed(1, 1); -+ return needed > storage_limit ? needed - storage_limit : 0; -+} -+ -+static unsigned long ram_still_required(void) -+{ -+ unsigned long needed = MIN_FREE_RAM + toi_memory_for_modules(0) + -+ 2 * extra_pd1_pages_allowance, -+ available = real_nr_free_low_pages(); -+ return needed > available ? needed - available : 0; -+} -+ -+static unsigned long any_to_free(int use_image_size_limit) -+{ -+ int use_soft_limit = use_image_size_limit && image_size_limit > 0; -+ unsigned long current_size = current_image_size(), -+ soft_limit = use_soft_limit ? (image_size_limit << 8) : 0, -+ to_free = use_soft_limit ? (current_size > soft_limit ? -+ current_size - soft_limit : 0) : 0, -+ storage_limit = storage_still_required(), -+ ram_limit = ram_still_required(), -+ first_max = max(to_free, storage_limit); -+ -+ return max(first_max, ram_limit); -+} -+ -+static int need_pageset2(void) -+{ -+ return (real_nr_free_low_pages() + extra_pages_allocated - -+ 2 * extra_pd1_pages_allowance - MIN_FREE_RAM - -+ toi_memory_for_modules(0) - pagedir1.size) < pagedir2.size; -+} -+ -+/* amount_needed -+ * -+ * Calculates the amount by which the image size needs to be reduced to meet -+ * our constraints. 
-+ */ -+static unsigned long amount_needed(int use_image_size_limit) -+{ -+ return max(highpages_ps1_to_free() + lowpages_ps1_to_free(), -+ any_to_free(use_image_size_limit)); -+} -+ -+static int image_not_ready(int use_image_size_limit) -+{ -+ toi_message(TOI_EAT_MEMORY, TOI_LOW, 1, -+ "Amount still needed (%lu) > 0:%u," -+ " Storage allocd: %lu < %lu: %u.\n", -+ amount_needed(use_image_size_limit), -+ (amount_needed(use_image_size_limit) > 0), -+ main_storage_allocated, -+ main_storage_needed(1, 1), -+ main_storage_allocated < main_storage_needed(1, 1)); -+ -+ toi_cond_pause(0, NULL); -+ -+ return (amount_needed(use_image_size_limit) > 0) || -+ main_storage_allocated < main_storage_needed(1, 1); -+} -+ -+static void display_failure_reason(int tries_exceeded) -+{ -+ unsigned long storage_required = storage_still_required(), -+ ram_required = ram_still_required(), -+ high_ps1 = highpages_ps1_to_free(), -+ low_ps1 = lowpages_ps1_to_free(); -+ -+ printk(KERN_INFO "Failed to prepare the image because...\n"); -+ -+ if (!storage_limit) { -+ printk(KERN_INFO "- You need some storage available to be " -+ "able to hibernate.\n"); -+ return; -+ } -+ -+ if (tries_exceeded) -+ printk(KERN_INFO "- The maximum number of iterations was " -+ "reached without successfully preparing the " -+ "image.\n"); -+ -+ if (storage_required) { -+ printk(KERN_INFO " - We need at least %lu pages of storage " -+ "(ignoring the header), but only have %lu.\n", -+ main_storage_needed(1, 1), -+ main_storage_allocated); -+ set_abort_result(TOI_INSUFFICIENT_STORAGE); -+ } -+ -+ if (ram_required) { -+ printk(KERN_INFO " - We need %lu more free pages of low " -+ "memory.\n", ram_required); -+ printk(KERN_INFO " Minimum free : %8d\n", MIN_FREE_RAM); -+ printk(KERN_INFO " + Reqd. 
by modules : %8lu\n", -+ toi_memory_for_modules(0)); -+ printk(KERN_INFO " + 2 * extra allow : %8lu\n", -+ 2 * extra_pd1_pages_allowance); -+ printk(KERN_INFO " - Currently free : %8lu\n", -+ real_nr_free_low_pages()); -+ printk(KERN_INFO " : ========\n"); -+ printk(KERN_INFO " Still needed : %8lu\n", -+ ram_required); -+ -+ /* Print breakdown of memory needed for modules */ -+ toi_memory_for_modules(1); -+ set_abort_result(TOI_UNABLE_TO_FREE_ENOUGH_MEMORY); -+ } -+ -+ if (high_ps1) { -+ printk(KERN_INFO "- We need to free %lu highmem pageset 1 " -+ "pages.\n", high_ps1); -+ set_abort_result(TOI_UNABLE_TO_FREE_ENOUGH_MEMORY); -+ } -+ -+ if (low_ps1) { -+ printk(KERN_INFO " - We need to free %ld lowmem pageset 1 " -+ "pages.\n", low_ps1); -+ set_abort_result(TOI_UNABLE_TO_FREE_ENOUGH_MEMORY); -+ } -+} -+ -+static void display_stats(int always, int sub_extra_pd1_allow) -+{ -+ char buffer[255]; -+ snprintf(buffer, 254, -+ "Free:%lu(%lu). Sets:%lu(%lu),%lu(%lu). " -+ "Nosave:%lu-%lu=%lu. Storage:%lu/%lu(%lu=>%lu). " -+ "Needed:%lu,%lu,%lu(%u,%lu,%lu,%ld) (PS2:%s)\n", -+ -+ /* Free */ -+ real_nr_free_pages(all_zones_mask), -+ real_nr_free_low_pages(), -+ -+ /* Sets */ -+ pagedir1.size, pagedir1.size - get_highmem_size(pagedir1), -+ pagedir2.size, pagedir2.size - get_highmem_size(pagedir2), -+ -+ /* Nosave */ -+ num_nosave, extra_pages_allocated, -+ num_nosave - extra_pages_allocated, -+ -+ /* Storage */ -+ main_storage_allocated, -+ storage_limit, -+ main_storage_needed(1, sub_extra_pd1_allow), -+ main_storage_needed(1, 1), -+ -+ /* Needed */ -+ lowpages_ps1_to_free(), highpages_ps1_to_free(), -+ any_to_free(1), -+ MIN_FREE_RAM, toi_memory_for_modules(0), -+ extra_pd1_pages_allowance, -+ image_size_limit, -+ -+ need_pageset2() ? 
"yes" : "no"); -+ -+ if (always) -+ printk("%s", buffer); -+ else -+ toi_message(TOI_EAT_MEMORY, TOI_MEDIUM, 1, buffer); -+} -+ -+/* generate_free_page_map -+ * -+ * Description: This routine generates a bitmap of free pages from the -+ * lists used by the memory manager. We then use the bitmap -+ * to quickly calculate which pages to save and in which -+ * pagesets. -+ */ -+static void generate_free_page_map(void) -+{ -+ int order, cpu, t; -+ unsigned long flags, i; -+ struct zone *zone; -+ struct list_head *curr; -+ unsigned long pfn; -+ struct page *page; -+ -+ for_each_populated_zone(zone) { -+ -+ if (!zone->spanned_pages) -+ continue; -+ -+ spin_lock_irqsave(&zone->lock, flags); -+ -+ for (i = 0; i < zone->spanned_pages; i++) { -+ pfn = ZONE_START(zone) + i; -+ -+ if (!pfn_valid(pfn)) -+ continue; -+ -+ page = pfn_to_page(pfn); -+ -+ ClearPageNosaveFree(page); -+ } -+ -+ for_each_migratetype_order(order, t) { -+ list_for_each(curr, -+ &zone->free_area[order].free_list[t]) { -+ unsigned long j; -+ -+ pfn = page_to_pfn(list_entry(curr, struct page, -+ lru)); -+ for (j = 0; j < (1UL << order); j++) -+ SetPageNosaveFree(pfn_to_page(pfn + j)); -+ } -+ } -+ -+ for_each_online_cpu(cpu) { -+ struct per_cpu_pageset *pset = zone_pcp(zone, cpu); -+ struct per_cpu_pages *pcp = &pset->pcp; -+ struct page *page; -+ int t; -+ -+ for (t = 0; t < MIGRATE_PCPTYPES; t++) -+ list_for_each_entry(page, &pcp->lists[t], lru) -+ SetPageNosaveFree(page); -+ } -+ -+ spin_unlock_irqrestore(&zone->lock, flags); -+ } -+} -+ -+/* size_of_free_region -+ * -+ * Description: Return the number of pages that are free, beginning with and -+ * including this one. 
-+ */ -+static int size_of_free_region(struct zone *zone, unsigned long start_pfn) -+{ -+ unsigned long this_pfn = start_pfn, -+ end_pfn = ZONE_START(zone) + zone->spanned_pages - 1; -+ -+ while (this_pfn <= end_pfn && PageNosaveFree(pfn_to_page(this_pfn))) -+ this_pfn++; -+ -+ return this_pfn - start_pfn; -+} -+ -+/* flag_image_pages -+ * -+ * This routine generates our lists of pages to be stored in each -+ * pageset. Since we store the data using extents, and adding new -+ * extents might allocate a new extent page, this routine may well -+ * be called more than once. -+ */ -+static void flag_image_pages(int atomic_copy) -+{ -+ int num_free = 0; -+ unsigned long loop; -+ struct zone *zone; -+ -+ pagedir1.size = 0; -+ pagedir2.size = 0; -+ -+ set_highmem_size(pagedir1, 0); -+ set_highmem_size(pagedir2, 0); -+ -+ num_nosave = 0; -+ -+ memory_bm_clear(pageset1_map); -+ -+ generate_free_page_map(); -+ -+ /* -+ * Pages not to be saved are marked Nosave irrespective of being -+ * reserved. -+ */ -+ for_each_populated_zone(zone) { -+ int highmem = is_highmem(zone); -+ -+ for (loop = 0; loop < zone->spanned_pages; loop++) { -+ unsigned long pfn = ZONE_START(zone) + loop; -+ struct page *page; -+ int chunk_size; -+ -+ if (!pfn_valid(pfn)) -+ continue; -+ -+ chunk_size = size_of_free_region(zone, pfn); -+ if (chunk_size) { -+ num_free += chunk_size; -+ loop += chunk_size - 1; -+ continue; -+ } -+ -+ page = pfn_to_page(pfn); -+ -+ if (PageNosave(page)) { -+ num_nosave++; -+ continue; -+ } -+ -+ page = highmem ? 
saveable_highmem_page(zone, pfn) : -+ saveable_page(zone, pfn); -+ -+ if (!page) { -+ num_nosave++; -+ continue; -+ } -+ -+ if (PagePageset2(page)) { -+ pagedir2.size++; -+ if (PageHighMem(page)) -+ inc_highmem_size(pagedir2); -+ else -+ SetPagePageset1Copy(page); -+ if (PageResave(page)) { -+ SetPagePageset1(page); -+ ClearPagePageset1Copy(page); -+ pagedir1.size++; -+ if (PageHighMem(page)) -+ inc_highmem_size(pagedir1); -+ } -+ } else { -+ pagedir1.size++; -+ SetPagePageset1(page); -+ if (PageHighMem(page)) -+ inc_highmem_size(pagedir1); -+ } -+ } -+ } -+ -+ if (!atomic_copy) -+ toi_message(TOI_EAT_MEMORY, TOI_MEDIUM, 0, -+ "Count data pages: Set1 (%d) + Set2 (%d) + Nosave (%ld)" -+ " + NumFree (%d) = %d.\n", -+ pagedir1.size, pagedir2.size, num_nosave, num_free, -+ pagedir1.size + pagedir2.size + num_nosave + num_free); -+} -+ -+void toi_recalculate_image_contents(int atomic_copy) -+{ -+ memory_bm_clear(pageset1_map); -+ if (!atomic_copy) { -+ unsigned long pfn; -+ memory_bm_position_reset(pageset2_map); -+ for (pfn = memory_bm_next_pfn(pageset2_map); -+ pfn != BM_END_OF_MAP; -+ pfn = memory_bm_next_pfn(pageset2_map)) -+ ClearPagePageset1Copy(pfn_to_page(pfn)); -+ /* Need to call this before getting pageset1_size! */ -+ toi_mark_pages_for_pageset2(); -+ } -+ flag_image_pages(atomic_copy); -+ -+ if (!atomic_copy) { -+ storage_limit = toiActiveAllocator->storage_available(); -+ display_stats(0, 0); -+ } -+} -+ -+/* update_image -+ * -+ * Allocate [more] memory and storage for the image. 
-+ */ -+static void update_image(int ps2_recalc) -+{ -+ int old_header_req; -+ unsigned long seek, wanted, got; -+ -+ /* Include allowance for growth in pagedir1 while writing pagedir 2 */ -+ wanted = pagedir1.size + extra_pd1_pages_allowance - -+ get_lowmem_size(pagedir2); -+ if (wanted > extra_pages_allocated) { -+ got = toi_allocate_extra_pagedir_memory(wanted); -+ if (wanted < got) { -+ toi_message(TOI_EAT_MEMORY, TOI_LOW, 1, -+ "Want %d extra pages for pageset1, got %d.\n", -+ wanted, got); -+ return; -+ } -+ } -+ -+ if (ps2_recalc) -+ goto recalc; -+ -+ thaw_kernel_threads(); -+ -+ /* -+ * Allocate remaining storage space, if possible, up to the -+ * maximum we know we'll need. It's okay to allocate the -+ * maximum if the writer is the swapwriter, but -+ * we don't want to grab all available space on an NFS share. -+ * We therefore ignore the expected compression ratio here, -+ * thereby trying to allocate the maximum image size we could -+ * need (assuming compression doesn't expand the image), but -+ * don't complain if we can't get the full amount we're after. -+ */ -+ -+ do { -+ int result; -+ -+ old_header_req = header_storage_needed; -+ toiActiveAllocator->reserve_header_space(header_storage_needed); -+ -+ /* How much storage is free with the reservation applied? */ -+ storage_limit = toiActiveAllocator->storage_available(); -+ seek = min(storage_limit, main_storage_needed(0, 0)); -+ -+ result = toiActiveAllocator->allocate_storage(seek); -+ if (result) -+ printk("Failed to allocate storage (%d).\n", result); -+ -+ main_storage_allocated = -+ toiActiveAllocator->storage_allocated(); -+ -+ /* Need more header because more storage allocated? */ -+ header_storage_needed = get_header_storage_needed(); -+ -+ } while (header_storage_needed > old_header_req); -+ -+ if (freeze_processes()) -+ set_abort_result(TOI_FREEZING_FAILED); -+ -+recalc: -+ toi_recalculate_image_contents(0); -+} -+ -+/* attempt_to_freeze -+ * -+ * Try to freeze processes. 
-+ */
-+
-+static int attempt_to_freeze(void)
-+{
-+ int result;
-+
-+ /* Stop processes before checking again */
-+ thaw_processes();
-+ toi_prepare_status(CLEAR_BAR, "Freezing processes & syncing "
-+ "filesystems.");
-+ result = freeze_processes();
-+
-+ if (result)
-+ set_abort_result(TOI_FREEZING_FAILED);
-+
-+ return result;
-+}
-+
-+/* eat_memory
-+ *
-+ * Try to free some memory, either to meet hard or soft constraints on the image
-+ * characteristics.
-+ *
-+ * Hard constraints:
-+ * - Pageset1 must be < half of memory;
-+ * - We must have enough memory free at resume time to have pageset1
-+ * be able to be loaded in pages that don't conflict with where it has to
-+ * be restored.
-+ * Soft constraints:
-+ * - User specified image size limit.
-+ */
-+static void eat_memory(void)
-+{
-+ unsigned long amount_wanted = 0;
-+ int did_eat_memory = 0;
-+
-+ /*
-+ * Note that if we have enough storage space and enough free memory, we
-+ * may exit without eating anything. We give up when the last 10
-+ * iterations ate no extra pages because we're not going to get much
-+ * more anyway, but the few pages we get will take a lot of time.
-+ *
-+ * We freeze processes before beginning, and then unfreeze them if we
-+ * need to eat memory until we think we have enough. If our attempts
-+ * to freeze fail, we give up and abort.
-+ */ -+ -+ amount_wanted = amount_needed(1); -+ -+ switch (image_size_limit) { -+ case -1: /* Don't eat any memory */ -+ if (amount_wanted > 0) { -+ set_abort_result(TOI_WOULD_EAT_MEMORY); -+ return; -+ } -+ break; -+ case -2: /* Free caches only */ -+ drop_pagecache(); -+ toi_recalculate_image_contents(0); -+ amount_wanted = amount_needed(1); -+ break; -+ default: -+ break; -+ } -+ -+ if (amount_wanted > 0 && !test_result_state(TOI_ABORTED) && -+ image_size_limit != -1) { -+ unsigned long request = amount_wanted + 50; -+ -+ toi_prepare_status(CLEAR_BAR, -+ "Seeking to free %ldMB of memory.", -+ MB(amount_wanted)); -+ -+ thaw_kernel_threads(); -+ -+ /* -+ * Ask for too many because shrink_all_memory doesn't -+ * currently return enough most of the time. -+ */ -+ shrink_all_memory(request); -+ -+ did_eat_memory = 1; -+ -+ toi_recalculate_image_contents(0); -+ -+ amount_wanted = amount_needed(1); -+ -+ printk(KERN_DEBUG "Asked shrink_all_memory for %ld pages," -+ "got %ld.\n", request, -+ request - amount_wanted); -+ -+ toi_cond_pause(0, NULL); -+ -+ if (freeze_processes()) -+ set_abort_result(TOI_FREEZING_FAILED); -+ } -+ -+ if (did_eat_memory) -+ toi_recalculate_image_contents(0); -+} -+ -+/* toi_prepare_image -+ * -+ * Entry point to the whole image preparation section. -+ * -+ * We do four things: -+ * - Freeze processes; -+ * - Ensure image size constraints are met; -+ * - Complete all the preparation for saving the image, -+ * including allocation of storage. The only memory -+ * that should be needed when we're finished is that -+ * for actually storing the image (and we know how -+ * much is needed for that because the modules tell -+ * us). -+ * - Make sure that all dirty buffers are written out. 
-+ */ -+#define MAX_TRIES 2 -+int toi_prepare_image(void) -+{ -+ int result = 1, tries = 1; -+ -+ main_storage_allocated = 0; -+ no_ps2_needed = 0; -+ -+ if (attempt_to_freeze()) -+ return 1; -+ -+ if (!extra_pd1_pages_allowance) -+ get_extra_pd1_allowance(); -+ -+ storage_limit = toiActiveAllocator->storage_available(); -+ -+ if (!storage_limit) { -+ printk(KERN_INFO "No storage available. Didn't try to prepare " -+ "an image.\n"); -+ display_failure_reason(0); -+ set_abort_result(TOI_NOSTORAGE_AVAILABLE); -+ return 1; -+ } -+ -+ if (build_attention_list()) { -+ abort_hibernate(TOI_UNABLE_TO_PREPARE_IMAGE, -+ "Unable to successfully prepare the image.\n"); -+ return 1; -+ } -+ -+ toi_recalculate_image_contents(0); -+ -+ do { -+ toi_prepare_status(CLEAR_BAR, -+ "Preparing Image. Try %d.", tries); -+ -+ eat_memory(); -+ -+ if (test_result_state(TOI_ABORTED)) -+ break; -+ -+ update_image(0); -+ -+ tries++; -+ -+ } while (image_not_ready(1) && tries <= MAX_TRIES && -+ !test_result_state(TOI_ABORTED)); -+ -+ result = image_not_ready(0); -+ -+ if (!test_result_state(TOI_ABORTED)) { -+ if (result) { -+ display_stats(1, 0); -+ display_failure_reason(tries > MAX_TRIES); -+ abort_hibernate(TOI_UNABLE_TO_PREPARE_IMAGE, -+ "Unable to successfully prepare the image.\n"); -+ } else { -+ /* Pageset 2 needed? */ -+ if (!need_pageset2() && -+ test_action_state(TOI_NO_PS2_IF_UNNEEDED)) { -+ no_ps2_needed = 1; -+ toi_recalculate_image_contents(0); -+ update_image(1); -+ } -+ -+ toi_cond_pause(1, "Image preparation complete."); -+ } -+ } -+ -+ return result ? result : allocate_checksum_pages(); -+} -diff --git a/kernel/power/tuxonice_prepare_image.h b/kernel/power/tuxonice_prepare_image.h -new file mode 100644 -index 0000000..7b52e9e ---- /dev/null -+++ b/kernel/power/tuxonice_prepare_image.h -@@ -0,0 +1,36 @@ -+/* -+ * kernel/power/tuxonice_prepare_image.h -+ * -+ * Copyright (C) 2003-2010 Nigel Cunningham (nigel at tuxonice net) -+ * -+ * This file is released under the GPLv2. 
-+ * -+ */ -+ -+#include -+ -+extern int toi_prepare_image(void); -+extern void toi_recalculate_image_contents(int storage_available); -+extern unsigned long real_nr_free_pages(unsigned long zone_idx_mask); -+extern long image_size_limit; -+extern void toi_free_extra_pagedir_memory(void); -+extern unsigned long extra_pd1_pages_allowance; -+extern void free_attention_list(void); -+ -+#define MIN_FREE_RAM 100 -+#define MIN_EXTRA_PAGES_ALLOWANCE 500 -+ -+#define all_zones_mask ((unsigned long) ((1 << MAX_NR_ZONES) - 1)) -+#ifdef CONFIG_HIGHMEM -+#define real_nr_free_high_pages() (real_nr_free_pages(1 << ZONE_HIGHMEM)) -+#define real_nr_free_low_pages() (real_nr_free_pages(all_zones_mask - \ -+ (1 << ZONE_HIGHMEM))) -+#else -+#define real_nr_free_high_pages() (0) -+#define real_nr_free_low_pages() (real_nr_free_pages(all_zones_mask)) -+ -+/* For eat_memory function */ -+#define ZONE_HIGHMEM (MAX_NR_ZONES + 1) -+#endif -+ -+unsigned long get_header_storage_needed(void); -diff --git a/kernel/power/tuxonice_storage.c b/kernel/power/tuxonice_storage.c -new file mode 100644 -index 0000000..be962ee ---- /dev/null -+++ b/kernel/power/tuxonice_storage.c -@@ -0,0 +1,282 @@ -+/* -+ * kernel/power/tuxonice_storage.c -+ * -+ * Copyright (C) 2005-2010 Nigel Cunningham (nigel at tuxonice net) -+ * -+ * This file is released under the GPLv2. -+ * -+ * Routines for talking to a userspace program that manages storage. 
-+ * -+ * The kernel side: -+ * - starts the userspace program; -+ * - sends messages telling it when to open and close the connection; -+ * - tells it when to quit; -+ * -+ * The user space side: -+ * - passes messages regarding status; -+ * -+ */ -+ -+#include -+#include -+ -+#include "tuxonice_sysfs.h" -+#include "tuxonice_modules.h" -+#include "tuxonice_netlink.h" -+#include "tuxonice_storage.h" -+#include "tuxonice_ui.h" -+ -+static struct user_helper_data usm_helper_data; -+static struct toi_module_ops usm_ops; -+static int message_received, usm_prepare_count; -+static int storage_manager_last_action, storage_manager_action; -+ -+static int usm_user_rcv_msg(struct sk_buff *skb, struct nlmsghdr *nlh) -+{ -+ int type; -+ int *data; -+ -+ type = nlh->nlmsg_type; -+ -+ /* A control message: ignore them */ -+ if (type < NETLINK_MSG_BASE) -+ return 0; -+ -+ /* Unknown message: reply with EINVAL */ -+ if (type >= USM_MSG_MAX) -+ return -EINVAL; -+ -+ /* All operations require privileges, even GET */ -+ if (security_netlink_recv(skb, CAP_NET_ADMIN)) -+ return -EPERM; -+ -+ /* Only allow one task to receive NOFREEZE privileges */ -+ if (type == NETLINK_MSG_NOFREEZE_ME && usm_helper_data.pid != -1) -+ return -EBUSY; -+ -+ data = (int *) NLMSG_DATA(nlh); -+ -+ switch (type) { -+ case USM_MSG_SUCCESS: -+ case USM_MSG_FAILED: -+ message_received = type; -+ complete(&usm_helper_data.wait_for_process); -+ break; -+ default: -+ printk(KERN_INFO "Storage manager doesn't recognise " -+ "message %d.\n", type); -+ } -+ -+ return 1; -+} -+ -+#ifdef CONFIG_NET -+static int activations; -+ -+int toi_activate_storage(int force) -+{ -+ int tries = 1; -+ -+ if (usm_helper_data.pid == -1 || !usm_ops.enabled) -+ return 0; -+ -+ message_received = 0; -+ activations++; -+ -+ if (activations > 1 && !force) -+ return 0; -+ -+ while ((!message_received || message_received == USM_MSG_FAILED) && -+ tries < 2) { -+ toi_prepare_status(DONT_CLEAR_BAR, "Activate storage attempt " -+ "%d.\n", 
tries); -+ -+ init_completion(&usm_helper_data.wait_for_process); -+ -+ toi_send_netlink_message(&usm_helper_data, -+ USM_MSG_CONNECT, -+ NULL, 0); -+ -+ /* Wait 2 seconds for the userspace process to make contact */ -+ wait_for_completion_timeout(&usm_helper_data.wait_for_process, -+ 2*HZ); -+ -+ tries++; -+ } -+ -+ return 0; -+} -+ -+int toi_deactivate_storage(int force) -+{ -+ if (usm_helper_data.pid == -1 || !usm_ops.enabled) -+ return 0; -+ -+ message_received = 0; -+ activations--; -+ -+ if (activations && !force) -+ return 0; -+ -+ init_completion(&usm_helper_data.wait_for_process); -+ -+ toi_send_netlink_message(&usm_helper_data, -+ USM_MSG_DISCONNECT, -+ NULL, 0); -+ -+ wait_for_completion_timeout(&usm_helper_data.wait_for_process, 2*HZ); -+ -+ if (!message_received || message_received == USM_MSG_FAILED) { -+ printk(KERN_INFO "Returning failure disconnecting storage.\n"); -+ return 1; -+ } -+ -+ return 0; -+} -+#endif -+ -+static void storage_manager_simulate(void) -+{ -+ printk(KERN_INFO "--- Storage manager simulate ---\n"); -+ toi_prepare_usm(); -+ schedule(); -+ printk(KERN_INFO "--- Activate storage 1 ---\n"); -+ toi_activate_storage(1); -+ schedule(); -+ printk(KERN_INFO "--- Deactivate storage 1 ---\n"); -+ toi_deactivate_storage(1); -+ schedule(); -+ printk(KERN_INFO "--- Cleanup usm ---\n"); -+ toi_cleanup_usm(); -+ schedule(); -+ printk(KERN_INFO "--- Storage manager simulate ends ---\n"); -+} -+ -+static int usm_storage_needed(void) -+{ -+ return strlen(usm_helper_data.program); -+} -+ -+static int usm_save_config_info(char *buf) -+{ -+ int len = strlen(usm_helper_data.program); -+ memcpy(buf, usm_helper_data.program, len); -+ return len; -+} -+ -+static void usm_load_config_info(char *buf, int size) -+{ -+ /* Don't load the saved path if one has already been set */ -+ if (usm_helper_data.program[0]) -+ return; -+ -+ memcpy(usm_helper_data.program, buf, size); -+} -+ -+static int usm_memory_needed(void) -+{ -+ /* ball park figure of 32 pages */ 
-+ return 32 * PAGE_SIZE; -+} -+ -+/* toi_prepare_usm -+ */ -+int toi_prepare_usm(void) -+{ -+ usm_prepare_count++; -+ -+ if (usm_prepare_count > 1 || !usm_ops.enabled) -+ return 0; -+ -+ usm_helper_data.pid = -1; -+ -+ if (!*usm_helper_data.program) -+ return 0; -+ -+ toi_netlink_setup(&usm_helper_data); -+ -+ if (usm_helper_data.pid == -1) -+ printk(KERN_INFO "TuxOnIce Storage Manager wanted, but couldn't" -+ " start it.\n"); -+ -+ toi_activate_storage(0); -+ -+ return usm_helper_data.pid != -1; -+} -+ -+void toi_cleanup_usm(void) -+{ -+ usm_prepare_count--; -+ -+ if (usm_helper_data.pid > -1 && !usm_prepare_count) { -+ toi_deactivate_storage(0); -+ toi_netlink_close(&usm_helper_data); -+ } -+} -+ -+static void storage_manager_activate(void) -+{ -+ if (storage_manager_action == storage_manager_last_action) -+ return; -+ -+ if (storage_manager_action) -+ toi_prepare_usm(); -+ else -+ toi_cleanup_usm(); -+ -+ storage_manager_last_action = storage_manager_action; -+} -+ -+/* -+ * User interface specific /sys/power/tuxonice entries. -+ */ -+ -+static struct toi_sysfs_data sysfs_params[] = { -+ SYSFS_NONE("simulate_atomic_copy", storage_manager_simulate), -+ SYSFS_INT("enabled", SYSFS_RW, &usm_ops.enabled, 0, 1, 0, NULL), -+ SYSFS_STRING("program", SYSFS_RW, usm_helper_data.program, 254, 0, -+ NULL), -+ SYSFS_INT("activate_storage", SYSFS_RW , &storage_manager_action, 0, 1, -+ 0, storage_manager_activate) -+}; -+ -+static struct toi_module_ops usm_ops = { -+ .type = MISC_MODULE, -+ .name = "usm", -+ .directory = "storage_manager", -+ .module = THIS_MODULE, -+ .storage_needed = usm_storage_needed, -+ .save_config_info = usm_save_config_info, -+ .load_config_info = usm_load_config_info, -+ .memory_needed = usm_memory_needed, -+ -+ .sysfs_data = sysfs_params, -+ .num_sysfs_entries = sizeof(sysfs_params) / -+ sizeof(struct toi_sysfs_data), -+}; -+ -+/* toi_usm_sysfs_init -+ * Description: Boot time initialisation for user interface. 
-+ */ -+int toi_usm_init(void) -+{ -+ usm_helper_data.nl = NULL; -+ usm_helper_data.program[0] = '\0'; -+ usm_helper_data.pid = -1; -+ usm_helper_data.skb_size = 0; -+ usm_helper_data.pool_limit = 6; -+ usm_helper_data.netlink_id = NETLINK_TOI_USM; -+ usm_helper_data.name = "userspace storage manager"; -+ usm_helper_data.rcv_msg = usm_user_rcv_msg; -+ usm_helper_data.interface_version = 2; -+ usm_helper_data.must_init = 0; -+ init_completion(&usm_helper_data.wait_for_process); -+ -+ return toi_register_module(&usm_ops); -+} -+ -+void toi_usm_exit(void) -+{ -+ toi_netlink_close_complete(&usm_helper_data); -+ toi_unregister_module(&usm_ops); -+} -diff --git a/kernel/power/tuxonice_storage.h b/kernel/power/tuxonice_storage.h -new file mode 100644 -index 0000000..8c6b5a7 ---- /dev/null -+++ b/kernel/power/tuxonice_storage.h -@@ -0,0 +1,45 @@ -+/* -+ * kernel/power/tuxonice_storage.h -+ * -+ * Copyright (C) 2005-2010 Nigel Cunningham (nigel at tuxonice net) -+ * -+ * This file is released under the GPLv2. 
-+ */ -+ -+#ifdef CONFIG_NET -+int toi_prepare_usm(void); -+void toi_cleanup_usm(void); -+ -+int toi_activate_storage(int force); -+int toi_deactivate_storage(int force); -+extern int toi_usm_init(void); -+extern void toi_usm_exit(void); -+#else -+static inline int toi_usm_init(void) { return 0; } -+static inline void toi_usm_exit(void) { } -+ -+static inline int toi_activate_storage(int force) -+{ -+ return 0; -+} -+ -+static inline int toi_deactivate_storage(int force) -+{ -+ return 0; -+} -+ -+static inline int toi_prepare_usm(void) { return 0; } -+static inline void toi_cleanup_usm(void) { } -+#endif -+ -+enum { -+ USM_MSG_BASE = 0x10, -+ -+ /* Kernel -> Userspace */ -+ USM_MSG_CONNECT = 0x30, -+ USM_MSG_DISCONNECT = 0x31, -+ USM_MSG_SUCCESS = 0x40, -+ USM_MSG_FAILED = 0x41, -+ -+ USM_MSG_MAX, -+}; -diff --git a/kernel/power/tuxonice_swap.c b/kernel/power/tuxonice_swap.c -new file mode 100644 -index 0000000..f55ef5e ---- /dev/null -+++ b/kernel/power/tuxonice_swap.c -@@ -0,0 +1,487 @@ -+/* -+ * kernel/power/tuxonice_swap.c -+ * -+ * Copyright (C) 2004-2010 Nigel Cunningham (nigel at tuxonice net) -+ * -+ * Distributed under GPLv2. -+ * -+ * This file encapsulates functions for usage of swap space as a -+ * backing store. -+ */ -+ -+#include -+#include -+#include -+#include -+#include -+#include -+ -+#include "tuxonice.h" -+#include "tuxonice_sysfs.h" -+#include "tuxonice_modules.h" -+#include "tuxonice_io.h" -+#include "tuxonice_ui.h" -+#include "tuxonice_extent.h" -+#include "tuxonice_bio.h" -+#include "tuxonice_alloc.h" -+#include "tuxonice_builtin.h" -+ -+static struct toi_module_ops toi_swapops; -+ -+/* For swapfile automatically swapon/off'd. */ -+static char swapfilename[255] = ""; -+static int toi_swapon_status; -+ -+/* Swap Pages */ -+static unsigned long swap_allocated; -+ -+static struct sysinfo swapinfo; -+ -+/** -+ * enable_swapfile: Swapon the user specified swapfile prior to hibernating. 
-+ *
-+ * Activate the given swapfile if it wasn't already enabled. Remember whether
-+ * we really did swapon it for swapoffing later.
-+ */
-+static void enable_swapfile(void)
-+{
-+	int activateswapresult = -EINVAL;
-+
-+	if (swapfilename[0]) {
-+		/* Attempt to swap on with maximum priority */
-+		activateswapresult = sys_swapon(swapfilename, 0xFFFF);
-+		if (activateswapresult && activateswapresult != -EBUSY)
-+			printk(KERN_ERR "TuxOnIce: The swapfile/partition "
-+				"specified by /sys/power/tuxonice/swap/swapfile"
-+				" (%s) could not be turned on (error %d). "
-+				"Attempting to continue.\n",
-+				swapfilename, activateswapresult);
-+		if (!activateswapresult)
-+			toi_swapon_status = 1;
-+	}
-+}
-+
-+/**
-+ * disable_swapfile: Swapoff any file swaponed at the start of the cycle.
-+ *
-+ * If we did successfully swapon a file at the start of the cycle, swapoff
-+ * it now (finishing up).
-+ */
-+static void disable_swapfile(void)
-+{
-+	if (!toi_swapon_status)
-+		return;
-+
-+	sys_swapoff(swapfilename);
-+	toi_swapon_status = 0;
-+}
-+
-+static int add_blocks_to_extent_chain(struct toi_bdev_info *chain,
-+		unsigned long start, unsigned long end)
-+{
-+	if (test_action_state(TOI_TEST_BIO))
-+		toi_message(TOI_IO, TOI_VERBOSE, 0, "Adding extent %lu-%lu to "
-+			"chain %p.", start << chain->bmap_shift,
-+			end << chain->bmap_shift, chain);
-+
-+	return toi_add_to_extent_chain(&chain->blocks, start, end);
-+}
-+
-+
-+static int get_main_pool_phys_params(struct toi_bdev_info *chain)
-+{
-+	struct hibernate_extent *extentpointer = NULL;
-+	unsigned long address, extent_min = 0, extent_max = 0;
-+	int empty = 1;
-+
-+	toi_message(TOI_IO, TOI_VERBOSE, 0, "get main pool phys params for "
-+			"chain %d.", chain->allocator_index);
-+
-+	if (!chain->allocations.first)
-+		return 0;
-+
-+	if (chain->blocks.first)
-+		toi_put_extent_chain(&chain->blocks);
-+
-+	toi_extent_for_each(&chain->allocations, extentpointer, address) {
-+		swp_entry_t swap_address = (swp_entry_t) { address };
-+		struct block_device *bdev;
-+		sector_t new_sector = map_swap_entry(swap_address, &bdev);
-+
-+		if (empty) {
-+			empty = 0;
-+			extent_min = extent_max = new_sector;
-+			continue;
-+		}
-+
-+		if (new_sector == extent_max + 1) {
-+			extent_max++;
-+			continue;
-+		}
-+
-+		if (add_blocks_to_extent_chain(chain, extent_min, extent_max)) {
-+			printk(KERN_ERR "Out of memory while making block "
-+				"chains.\n");
-+			return -ENOMEM;
-+		}
-+
-+		extent_min = new_sector;
-+		extent_max = new_sector;
-+	}
-+
-+	if (!empty &&
-+	    add_blocks_to_extent_chain(chain, extent_min, extent_max)) {
-+		printk(KERN_ERR "Out of memory while making block chains.\n");
-+		return -ENOMEM;
-+	}
-+
-+	return 0;
-+}
-+
-+/*
-+ * Like si_swapinfo, except that we don't include ram backed swap (compcache!)
-+ * and don't need to use the spinlocks (userspace is stopped when this
-+ * function is called).
-+ */
-+void si_swapinfo_no_compcache(void)
-+{
-+	unsigned int i;
-+
-+	si_swapinfo(&swapinfo);
-+	swapinfo.freeswap = 0;
-+	swapinfo.totalswap = 0;
-+
-+	for (i = 0; i < MAX_SWAPFILES; i++) {
-+		struct swap_info_struct *si = get_swap_info_struct(i);
-+		if (si && (si->flags & SWP_WRITEOK) &&
-+		    (strncmp(si->bdev->bd_disk->disk_name, "ram", 3))) {
-+			swapinfo.totalswap += si->inuse_pages;
-+			swapinfo.freeswap += si->pages - si->inuse_pages;
-+		}
-+	}
-+}
-+/*
-+ * We can't just remember the value from allocation time, because other
-+ * processes might have allocated swap in the mean time.
-+ */ -+static unsigned long toi_swap_storage_available(void) -+{ -+ toi_message(TOI_IO, TOI_VERBOSE, 0, "In toi_swap_storage_available."); -+ si_swapinfo_no_compcache(); -+ return swapinfo.freeswap + swap_allocated; -+} -+ -+static int toi_swap_initialise(int starting_cycle) -+{ -+ if (!starting_cycle) -+ return 0; -+ -+ enable_swapfile(); -+ return 0; -+} -+ -+static void toi_swap_cleanup(int ending_cycle) -+{ -+ if (ending_cycle) -+ disable_swapfile(); -+} -+ -+static void toi_swap_free_storage(struct toi_bdev_info *chain) -+{ -+ /* Free swap entries */ -+ struct hibernate_extent *extentpointer; -+ unsigned long extentvalue; -+ -+ toi_message(TOI_IO, TOI_VERBOSE, 0, "Freeing storage for chain %p.", -+ chain); -+ -+ swap_allocated -= chain->allocations.size; -+ toi_extent_for_each(&chain->allocations, extentpointer, extentvalue) -+ swap_free((swp_entry_t) { extentvalue }); -+ -+ toi_put_extent_chain(&chain->allocations); -+} -+ -+static void free_swap_range(unsigned long min, unsigned long max) -+{ -+ int j; -+ -+ for (j = min; j <= max; j++) -+ swap_free((swp_entry_t) { j }); -+ swap_allocated -= (max - min + 1); -+} -+ -+/* -+ * Allocation of a single swap type. Swap priorities are handled at the higher -+ * level. 
-+ */ -+static int toi_swap_allocate_storage(struct toi_bdev_info *chain, -+ unsigned long request) -+{ -+ int to_add = 0; -+ unsigned long gotten = 0; -+ unsigned long extent_min = 0, extent_max = 0; -+ -+ toi_message(TOI_IO, TOI_VERBOSE, 0, " Swap allocate storage: Asked to" -+ " allocate %lu pages from device %d.", request, -+ chain->allocator_index); -+ -+ while (gotten < request) { -+ swp_entry_t entry; -+ unsigned long new_value; -+ -+ entry = get_swap_page_of_type(chain->allocator_index); -+ if (!entry.val) -+ break; -+ -+ swap_allocated++; -+ new_value = entry.val; -+ gotten++; -+ -+ if (!to_add) { -+ to_add = 1; -+ extent_min = new_value; -+ extent_max = new_value; -+ continue; -+ } -+ -+ if (new_value == extent_max + 1) { -+ extent_max++; -+ continue; -+ } -+ -+ if (toi_add_to_extent_chain(&chain->allocations, extent_min, -+ extent_max)) { -+ printk(KERN_INFO "Failed to allocate extent for " -+ "%lu-%lu.\n", extent_min, extent_max); -+ free_swap_range(extent_min, extent_max); -+ swap_free(entry); -+ gotten -= (extent_max - extent_min); -+ /* Don't try to add again below */ -+ to_add = 0; -+ break; -+ } -+ -+ extent_min = new_value; -+ extent_max = new_value; -+ } -+ -+ if (to_add) { -+ int this_result = toi_add_to_extent_chain(&chain->allocations, -+ extent_min, extent_max); -+ -+ if (this_result) { -+ free_swap_range(extent_min, extent_max); -+ gotten -= (extent_max - extent_min + 1); -+ } -+ } -+ -+ toi_message(TOI_IO, TOI_VERBOSE, 0, " Allocated %lu pages.", gotten); -+ return gotten; -+} -+ -+static int toi_swap_register_storage(void) -+{ -+ int i, result = 0; -+ -+ toi_message(TOI_IO, TOI_VERBOSE, 0, "toi_swap_register_storage."); -+ for (i = 0; i < MAX_SWAPFILES; i++) { -+ struct swap_info_struct *si = get_swap_info_struct(i); -+ struct toi_bdev_info *devinfo; -+ unsigned char *p; -+ unsigned char buf[256]; -+ struct fs_info *fs_info; -+ -+ if (!si || !(si->flags & SWP_WRITEOK) || -+ !strncmp(si->bdev->bd_disk->disk_name, "ram", 3)) -+ continue; -+ 
-+ devinfo = toi_kzalloc(39, sizeof(struct toi_bdev_info), -+ GFP_ATOMIC); -+ if (!devinfo) { -+ printk("Failed to allocate devinfo struct for swap " -+ "device %d.\n", i); -+ return -ENOMEM; -+ } -+ -+ devinfo->bdev = si->bdev; -+ devinfo->allocator = &toi_swapops; -+ devinfo->allocator_index = i; -+ -+ fs_info = fs_info_from_block_dev(si->bdev); -+ if (fs_info && !IS_ERR(fs_info)) { -+ memcpy(devinfo->uuid, &fs_info->uuid, 16); -+ free_fs_info(fs_info); -+ } else -+ result = (int) PTR_ERR(fs_info); -+ -+ if (!fs_info) -+ printk("fs_info from block dev returned %d.\n", result); -+ devinfo->dev_t = si->bdev->bd_dev; -+ devinfo->prio = si->prio; -+ devinfo->bmap_shift = 3; -+ devinfo->blocks_per_page = 1; -+ -+ p = d_path(&si->swap_file->f_path, buf, sizeof(buf)); -+ sprintf(devinfo->name, "swap on %s", p); -+ -+ toi_message(TOI_IO, TOI_VERBOSE, 0, "Registering swap storage:" -+ " Device %d (%lx), prio %d.", i, -+ (unsigned long) devinfo->dev_t, devinfo->prio); -+ toi_bio_ops.register_storage(devinfo); -+ } -+ -+ return 0; -+} -+ -+/* -+ * workspace_size -+ * -+ * Description: -+ * Returns the number of bytes of RAM needed for this -+ * code to do its work. (Used when calculating whether -+ * we have enough memory to be able to hibernate & resume). 
-+ * -+ */ -+static int toi_swap_memory_needed(void) -+{ -+ return 1; -+} -+ -+/* -+ * Print debug info -+ * -+ * Description: -+ */ -+static int toi_swap_print_debug_stats(char *buffer, int size) -+{ -+ int len = 0; -+ -+ len = scnprintf(buffer, size, "- Swap Allocator enabled.\n"); -+ if (swapfilename[0]) -+ len += scnprintf(buffer+len, size-len, -+ " Attempting to automatically swapon: %s.\n", -+ swapfilename); -+ -+ si_swapinfo_no_compcache(); -+ -+ len += scnprintf(buffer+len, size-len, -+ " Swap available for image: %lu pages.\n", -+ swapinfo.freeswap + swap_allocated); -+ -+ return len; -+} -+ -+static int header_locations_read_sysfs(const char *page, int count) -+{ -+ int i, printedpartitionsmessage = 0, len = 0, haveswap = 0; -+ struct inode *swapf = NULL; -+ int zone; -+ char *path_page = (char *) toi_get_free_page(10, GFP_KERNEL); -+ char *path, *output = (char *) page; -+ int path_len; -+ -+ if (!page) -+ return 0; -+ -+ for (i = 0; i < MAX_SWAPFILES; i++) { -+ struct swap_info_struct *si = get_swap_info_struct(i); -+ -+ if (!si || !(si->flags & SWP_WRITEOK)) -+ continue; -+ -+ if (S_ISBLK(si->swap_file->f_mapping->host->i_mode)) { -+ haveswap = 1; -+ if (!printedpartitionsmessage) { -+ len += sprintf(output + len, -+ "For swap partitions, simply use the " -+ "format: resume=swap:/dev/hda1.\n"); -+ printedpartitionsmessage = 1; -+ } -+ } else { -+ path_len = 0; -+ -+ path = d_path(&si->swap_file->f_path, path_page, -+ PAGE_SIZE); -+ path_len = snprintf(path_page, PAGE_SIZE, "%s", path); -+ -+ haveswap = 1; -+ swapf = si->swap_file->f_mapping->host; -+ zone = bmap(swapf, 0); -+ if (!zone) { -+ len += sprintf(output + len, -+ "Swapfile %s has been corrupted. 
Reuse" -+ " mkswap on it and try again.\n", -+ path_page); -+ } else { -+ char name_buffer[BDEVNAME_SIZE]; -+ len += sprintf(output + len, -+ "For swapfile `%s`," -+ " use resume=swap:/dev/%s:0x%x.\n", -+ path_page, -+ bdevname(si->bdev, name_buffer), -+ zone << (swapf->i_blkbits - 9)); -+ } -+ } -+ } -+ -+ if (!haveswap) -+ len = sprintf(output, "You need to turn on swap partitions " -+ "before examining this file.\n"); -+ -+ toi_free_page(10, (unsigned long) path_page); -+ return len; -+} -+ -+static struct toi_sysfs_data sysfs_params[] = { -+ SYSFS_STRING("swapfilename", SYSFS_RW, swapfilename, 255, 0, NULL), -+ SYSFS_CUSTOM("headerlocations", SYSFS_READONLY, -+ header_locations_read_sysfs, NULL, 0, NULL), -+ SYSFS_INT("enabled", SYSFS_RW, &toi_swapops.enabled, 0, 1, 0, -+ attempt_to_parse_resume_device2), -+}; -+ -+static struct toi_bio_allocator_ops toi_bio_swapops = { -+ .register_storage = toi_swap_register_storage, -+ .storage_available = toi_swap_storage_available, -+ .allocate_storage = toi_swap_allocate_storage, -+ .bmap = get_main_pool_phys_params, -+ .free_storage = toi_swap_free_storage, -+}; -+ -+static struct toi_module_ops toi_swapops = { -+ .type = BIO_ALLOCATOR_MODULE, -+ .name = "swap storage", -+ .directory = "swap", -+ .module = THIS_MODULE, -+ .memory_needed = toi_swap_memory_needed, -+ .print_debug_info = toi_swap_print_debug_stats, -+ .initialise = toi_swap_initialise, -+ .cleanup = toi_swap_cleanup, -+ .bio_allocator_ops = &toi_bio_swapops, -+ -+ .sysfs_data = sysfs_params, -+ .num_sysfs_entries = sizeof(sysfs_params) / -+ sizeof(struct toi_sysfs_data), -+}; -+ -+/* ---- Registration ---- */ -+static __init int toi_swap_load(void) -+{ -+ return toi_register_module(&toi_swapops); -+} -+ -+#ifdef MODULE -+static __exit void toi_swap_unload(void) -+{ -+ toi_unregister_module(&toi_swapops); -+} -+ -+module_init(toi_swap_load); -+module_exit(toi_swap_unload); -+MODULE_LICENSE("GPL"); -+MODULE_AUTHOR("Nigel Cunningham"); 
-+MODULE_DESCRIPTION("TuxOnIce SwapAllocator"); -+#else -+late_initcall(toi_swap_load); -+#endif -diff --git a/kernel/power/tuxonice_sysfs.c b/kernel/power/tuxonice_sysfs.c -new file mode 100644 -index 0000000..0088409 ---- /dev/null -+++ b/kernel/power/tuxonice_sysfs.c -@@ -0,0 +1,335 @@ -+/* -+ * kernel/power/tuxonice_sysfs.c -+ * -+ * Copyright (C) 2002-2010 Nigel Cunningham (nigel at tuxonice net) -+ * -+ * This file is released under the GPLv2. -+ * -+ * This file contains support for sysfs entries for tuning TuxOnIce. -+ * -+ * We have a generic handler that deals with the most common cases, and -+ * hooks for special handlers to use. -+ */ -+ -+#include -+ -+#include "tuxonice_sysfs.h" -+#include "tuxonice.h" -+#include "tuxonice_storage.h" -+#include "tuxonice_alloc.h" -+ -+static int toi_sysfs_initialised; -+ -+static void toi_initialise_sysfs(void); -+ -+static struct toi_sysfs_data sysfs_params[]; -+ -+#define to_sysfs_data(_attr) container_of(_attr, struct toi_sysfs_data, attr) -+ -+static void toi_main_wrapper(void) -+{ -+ toi_try_hibernate(); -+} -+ -+static ssize_t toi_attr_show(struct kobject *kobj, struct attribute *attr, -+ char *page) -+{ -+ struct toi_sysfs_data *sysfs_data = to_sysfs_data(attr); -+ int len = 0; -+ int full_prep = sysfs_data->flags & SYSFS_NEEDS_SM_FOR_READ; -+ -+ if (full_prep && toi_start_anything(0)) -+ return -EBUSY; -+ -+ if (sysfs_data->flags & SYSFS_NEEDS_SM_FOR_READ) -+ toi_prepare_usm(); -+ -+ switch (sysfs_data->type) { -+ case TOI_SYSFS_DATA_CUSTOM: -+ len = (sysfs_data->data.special.read_sysfs) ? 
-+ (sysfs_data->data.special.read_sysfs)(page, PAGE_SIZE) -+ : 0; -+ break; -+ case TOI_SYSFS_DATA_BIT: -+ len = sprintf(page, "%d\n", -+ -test_bit(sysfs_data->data.bit.bit, -+ sysfs_data->data.bit.bit_vector)); -+ break; -+ case TOI_SYSFS_DATA_INTEGER: -+ len = sprintf(page, "%d\n", -+ *(sysfs_data->data.integer.variable)); -+ break; -+ case TOI_SYSFS_DATA_LONG: -+ len = sprintf(page, "%ld\n", -+ *(sysfs_data->data.a_long.variable)); -+ break; -+ case TOI_SYSFS_DATA_UL: -+ len = sprintf(page, "%lu\n", -+ *(sysfs_data->data.ul.variable)); -+ break; -+ case TOI_SYSFS_DATA_STRING: -+ len = sprintf(page, "%s\n", -+ sysfs_data->data.string.variable); -+ break; -+ } -+ -+ if (sysfs_data->flags & SYSFS_NEEDS_SM_FOR_READ) -+ toi_cleanup_usm(); -+ -+ if (full_prep) -+ toi_finish_anything(0); -+ -+ return len; -+} -+ -+#define BOUND(_variable, _type) do { \ -+ if (*_variable < sysfs_data->data._type.minimum) \ -+ *_variable = sysfs_data->data._type.minimum; \ -+ else if (*_variable > sysfs_data->data._type.maximum) \ -+ *_variable = sysfs_data->data._type.maximum; \ -+} while (0) -+ -+static ssize_t toi_attr_store(struct kobject *kobj, struct attribute *attr, -+ const char *my_buf, size_t count) -+{ -+ int assigned_temp_buffer = 0, result = count; -+ struct toi_sysfs_data *sysfs_data = to_sysfs_data(attr); -+ -+ if (toi_start_anything((sysfs_data->flags & SYSFS_HIBERNATE_OR_RESUME))) -+ return -EBUSY; -+ -+ ((char *) my_buf)[count] = 0; -+ -+ if (sysfs_data->flags & SYSFS_NEEDS_SM_FOR_WRITE) -+ toi_prepare_usm(); -+ -+ switch (sysfs_data->type) { -+ case TOI_SYSFS_DATA_CUSTOM: -+ if (sysfs_data->data.special.write_sysfs) -+ result = (sysfs_data->data.special.write_sysfs)(my_buf, -+ count); -+ break; -+ case TOI_SYSFS_DATA_BIT: -+ { -+ unsigned long value; -+ result = strict_strtoul(my_buf, 0, &value); -+ if (result) -+ break; -+ if (value) -+ set_bit(sysfs_data->data.bit.bit, -+ (sysfs_data->data.bit.bit_vector)); -+ else -+ clear_bit(sysfs_data->data.bit.bit, -+ 
(sysfs_data->data.bit.bit_vector));
-+		}
-+		break;
-+	case TOI_SYSFS_DATA_INTEGER:
-+		{
-+			long temp;
-+			result = strict_strtol(my_buf, 0, &temp);
-+			if (result)
-+				break;
-+			*(sysfs_data->data.integer.variable) = (int) temp;
-+			BOUND(sysfs_data->data.integer.variable, integer);
-+			break;
-+		}
-+	case TOI_SYSFS_DATA_LONG:
-+		{
-+			long *variable =
-+				sysfs_data->data.a_long.variable;
-+			result = strict_strtol(my_buf, 0, variable);
-+			if (result)
-+				break;
-+			BOUND(variable, a_long);
-+			break;
-+		}
-+	case TOI_SYSFS_DATA_UL:
-+		{
-+			unsigned long *variable =
-+				sysfs_data->data.ul.variable;
-+			result = strict_strtoul(my_buf, 0, variable);
-+			if (result)
-+				break;
-+			BOUND(variable, ul);
-+			break;
-+		}
-+		break;
-+	case TOI_SYSFS_DATA_STRING:
-+		{
-+			int copy_len = count;
-+			char *variable =
-+				sysfs_data->data.string.variable;
-+
-+			if (sysfs_data->data.string.max_length &&
-+			    (copy_len > sysfs_data->data.string.max_length))
-+				copy_len = sysfs_data->data.string.max_length;
-+
-+			if (!variable) {
-+				variable = (char *) toi_get_zeroed_page(31,
-+						TOI_ATOMIC_GFP);
-+				sysfs_data->data.string.variable = variable;
-+				assigned_temp_buffer = 1;
-+			}
-+			strncpy(variable, my_buf, copy_len);
-+			if (copy_len && my_buf[copy_len - 1] == '\n')
-+				variable[count - 1] = 0;
-+			variable[count] = 0;
-+		}
-+		break;
-+	}
-+
-+	if (!result)
-+		result = count;
-+
-+	/* Side effect routine? */
-+	if (result == count && sysfs_data->write_side_effect)
-+		sysfs_data->write_side_effect();
-+
-+	/* Free temporary buffers */
-+	if (assigned_temp_buffer) {
-+		toi_free_page(31,
-+			(unsigned long) sysfs_data->data.string.variable);
-+		sysfs_data->data.string.variable = NULL;
-+	}
-+
-+	if (sysfs_data->flags & SYSFS_NEEDS_SM_FOR_WRITE)
-+		toi_cleanup_usm();
-+
-+	toi_finish_anything(sysfs_data->flags & SYSFS_HIBERNATE_OR_RESUME);
-+
-+	return result;
-+}
-+
-+static struct sysfs_ops toi_sysfs_ops = {
-+	.show = &toi_attr_show,
-+	.store = &toi_attr_store,
-+};
-+
-+static struct kobj_type toi_ktype = {
-+	.sysfs_ops = &toi_sysfs_ops,
-+};
-+
-+struct kobject *tuxonice_kobj;
-+
-+/* Non-module sysfs entries.
-+ *
-+ * This array contains entries that are automatically registered at
-+ * boot. Modules and the console code register their own entries separately.
-+ */
-+
-+static struct toi_sysfs_data sysfs_params[] = {
-+	SYSFS_CUSTOM("do_hibernate", SYSFS_WRITEONLY, NULL, NULL,
-+			SYSFS_HIBERNATING, toi_main_wrapper),
-+	SYSFS_CUSTOM("do_resume", SYSFS_WRITEONLY, NULL, NULL,
-+			SYSFS_RESUMING, toi_try_resume)
-+};
-+
-+void remove_toi_sysdir(struct kobject *kobj)
-+{
-+	if (!kobj)
-+		return;
-+
-+	kobject_put(kobj);
-+}
-+
-+struct kobject *make_toi_sysdir(char *name)
-+{
-+	struct kobject *kobj = kobject_create_and_add(name, tuxonice_kobj);
-+
-+	if (!kobj) {
-+		printk(KERN_INFO "TuxOnIce: Can't allocate kobject for sysfs "
-+			"dir!\n");
-+		return NULL;
-+	}
-+
-+	kobj->ktype = &toi_ktype;
-+
-+	return kobj;
-+}
-+
-+/* toi_register_sysfs_file
-+ *
-+ * Helper for registering a new /sysfs/tuxonice entry.
-+ */ -+ -+int toi_register_sysfs_file( -+ struct kobject *kobj, -+ struct toi_sysfs_data *toi_sysfs_data) -+{ -+ int result; -+ -+ if (!toi_sysfs_initialised) -+ toi_initialise_sysfs(); -+ -+ result = sysfs_create_file(kobj, &toi_sysfs_data->attr); -+ if (result) -+ printk(KERN_INFO "TuxOnIce: sysfs_create_file for %s " -+ "returned %d.\n", -+ toi_sysfs_data->attr.name, result); -+ kobj->ktype = &toi_ktype; -+ -+ return result; -+} -+EXPORT_SYMBOL_GPL(toi_register_sysfs_file); -+ -+/* toi_unregister_sysfs_file -+ * -+ * Helper for removing unwanted /sys/power/tuxonice entries. -+ * -+ */ -+void toi_unregister_sysfs_file(struct kobject *kobj, -+ struct toi_sysfs_data *toi_sysfs_data) -+{ -+ sysfs_remove_file(kobj, &toi_sysfs_data->attr); -+} -+EXPORT_SYMBOL_GPL(toi_unregister_sysfs_file); -+ -+void toi_cleanup_sysfs(void) -+{ -+ int i, -+ numfiles = sizeof(sysfs_params) / sizeof(struct toi_sysfs_data); -+ -+ if (!toi_sysfs_initialised) -+ return; -+ -+ for (i = 0; i < numfiles; i++) -+ toi_unregister_sysfs_file(tuxonice_kobj, &sysfs_params[i]); -+ -+ kobject_put(tuxonice_kobj); -+ toi_sysfs_initialised = 0; -+} -+ -+/* toi_initialise_sysfs -+ * -+ * Initialise the /sysfs/tuxonice directory. 
-+ */ -+ -+static void toi_initialise_sysfs(void) -+{ -+ int i; -+ int numfiles = sizeof(sysfs_params) / sizeof(struct toi_sysfs_data); -+ -+ if (toi_sysfs_initialised) -+ return; -+ -+ /* Make our TuxOnIce directory a child of /sys/power */ -+ tuxonice_kobj = kobject_create_and_add("tuxonice", power_kobj); -+ if (!tuxonice_kobj) -+ return; -+ -+ toi_sysfs_initialised = 1; -+ -+ for (i = 0; i < numfiles; i++) -+ toi_register_sysfs_file(tuxonice_kobj, &sysfs_params[i]); -+} -+ -+int toi_sysfs_init(void) -+{ -+ toi_initialise_sysfs(); -+ return 0; -+} -+ -+void toi_sysfs_exit(void) -+{ -+ toi_cleanup_sysfs(); -+} -diff --git a/kernel/power/tuxonice_sysfs.h b/kernel/power/tuxonice_sysfs.h -new file mode 100644 -index 0000000..4185c6d ---- /dev/null -+++ b/kernel/power/tuxonice_sysfs.h -@@ -0,0 +1,137 @@ -+/* -+ * kernel/power/tuxonice_sysfs.h -+ * -+ * Copyright (C) 2004-2010 Nigel Cunningham (nigel at tuxonice net) -+ * -+ * This file is released under the GPLv2. -+ */ -+ -+#include -+ -+struct toi_sysfs_data { -+ struct attribute attr; -+ int type; -+ int flags; -+ union { -+ struct { -+ unsigned long *bit_vector; -+ int bit; -+ } bit; -+ struct { -+ int *variable; -+ int minimum; -+ int maximum; -+ } integer; -+ struct { -+ long *variable; -+ long minimum; -+ long maximum; -+ } a_long; -+ struct { -+ unsigned long *variable; -+ unsigned long minimum; -+ unsigned long maximum; -+ } ul; -+ struct { -+ char *variable; -+ int max_length; -+ } string; -+ struct { -+ int (*read_sysfs) (const char *buffer, int count); -+ int (*write_sysfs) (const char *buffer, int count); -+ void *data; -+ } special; -+ } data; -+ -+ /* Side effects routine. 
Used, eg, for reparsing the -+ * resume= entry when it changes */ -+ void (*write_side_effect) (void); -+ struct list_head sysfs_data_list; -+}; -+ -+enum { -+ TOI_SYSFS_DATA_NONE = 1, -+ TOI_SYSFS_DATA_CUSTOM, -+ TOI_SYSFS_DATA_BIT, -+ TOI_SYSFS_DATA_INTEGER, -+ TOI_SYSFS_DATA_UL, -+ TOI_SYSFS_DATA_LONG, -+ TOI_SYSFS_DATA_STRING -+}; -+ -+#define SYSFS_WRITEONLY 0200 -+#define SYSFS_READONLY 0444 -+#define SYSFS_RW 0644 -+ -+#define SYSFS_BIT(_name, _mode, _ul, _bit, _flags) { \ -+ .attr = {.name = _name , .mode = _mode }, \ -+ .type = TOI_SYSFS_DATA_BIT, \ -+ .flags = _flags, \ -+ .data = { .bit = { .bit_vector = _ul, .bit = _bit } } } -+ -+#define SYSFS_INT(_name, _mode, _int, _min, _max, _flags, _wse) { \ -+ .attr = {.name = _name , .mode = _mode }, \ -+ .type = TOI_SYSFS_DATA_INTEGER, \ -+ .flags = _flags, \ -+ .data = { .integer = { .variable = _int, .minimum = _min, \ -+ .maximum = _max } }, \ -+ .write_side_effect = _wse } -+ -+#define SYSFS_UL(_name, _mode, _ul, _min, _max, _flags) { \ -+ .attr = {.name = _name , .mode = _mode }, \ -+ .type = TOI_SYSFS_DATA_UL, \ -+ .flags = _flags, \ -+ .data = { .ul = { .variable = _ul, .minimum = _min, \ -+ .maximum = _max } } } -+ -+#define SYSFS_LONG(_name, _mode, _long, _min, _max, _flags) { \ -+ .attr = {.name = _name , .mode = _mode }, \ -+ .type = TOI_SYSFS_DATA_LONG, \ -+ .flags = _flags, \ -+ .data = { .a_long = { .variable = _long, .minimum = _min, \ -+ .maximum = _max } } } -+ -+#define SYSFS_STRING(_name, _mode, _string, _max_len, _flags, _wse) { \ -+ .attr = {.name = _name , .mode = _mode }, \ -+ .type = TOI_SYSFS_DATA_STRING, \ -+ .flags = _flags, \ -+ .data = { .string = { .variable = _string, .max_length = _max_len } }, \ -+ .write_side_effect = _wse } -+ -+#define SYSFS_CUSTOM(_name, _mode, _read, _write, _flags, _wse) { \ -+ .attr = {.name = _name , .mode = _mode }, \ -+ .type = TOI_SYSFS_DATA_CUSTOM, \ -+ .flags = _flags, \ -+ .data = { .special = { .read_sysfs = _read, .write_sysfs = _write } }, \ -+ 
.write_side_effect = _wse } -+ -+#define SYSFS_NONE(_name, _wse) { \ -+ .attr = {.name = _name , .mode = SYSFS_WRITEONLY }, \ -+ .type = TOI_SYSFS_DATA_NONE, \ -+ .write_side_effect = _wse, \ -+} -+ -+/* Flags */ -+#define SYSFS_NEEDS_SM_FOR_READ 1 -+#define SYSFS_NEEDS_SM_FOR_WRITE 2 -+#define SYSFS_HIBERNATE 4 -+#define SYSFS_RESUME 8 -+#define SYSFS_HIBERNATE_OR_RESUME (SYSFS_HIBERNATE | SYSFS_RESUME) -+#define SYSFS_HIBERNATING (SYSFS_HIBERNATE | SYSFS_NEEDS_SM_FOR_WRITE) -+#define SYSFS_RESUMING (SYSFS_RESUME | SYSFS_NEEDS_SM_FOR_WRITE) -+#define SYSFS_NEEDS_SM_FOR_BOTH \ -+ (SYSFS_NEEDS_SM_FOR_READ | SYSFS_NEEDS_SM_FOR_WRITE) -+ -+int toi_register_sysfs_file(struct kobject *kobj, -+ struct toi_sysfs_data *toi_sysfs_data); -+void toi_unregister_sysfs_file(struct kobject *kobj, -+ struct toi_sysfs_data *toi_sysfs_data); -+ -+extern struct kobject *tuxonice_kobj; -+ -+struct kobject *make_toi_sysdir(char *name); -+void remove_toi_sysdir(struct kobject *obj); -+extern void toi_cleanup_sysfs(void); -+ -+extern int toi_sysfs_init(void); -+extern void toi_sysfs_exit(void); -diff --git a/kernel/power/tuxonice_ui.c b/kernel/power/tuxonice_ui.c -new file mode 100644 -index 0000000..b0b3b40 ---- /dev/null -+++ b/kernel/power/tuxonice_ui.c -@@ -0,0 +1,250 @@ -+/* -+ * kernel/power/tuxonice_ui.c -+ * -+ * Copyright (C) 1998-2001 Gabor Kuti -+ * Copyright (C) 1998,2001,2002 Pavel Machek -+ * Copyright (C) 2002-2003 Florent Chabaud -+ * Copyright (C) 2002-2010 Nigel Cunningham (nigel at tuxonice net) -+ * -+ * This file is released under the GPLv2. -+ * -+ * Routines for TuxOnIce's user interface. -+ * -+ * The user interface code talks to a userspace program via a -+ * netlink socket. 
-+ * -+ * The kernel side: -+ * - starts the userui program; -+ * - sends text messages and progress bar status; -+ * -+ * The user space side: -+ * - passes messages regarding user requests (abort, toggle reboot etc) -+ * -+ */ -+ -+#define __KERNEL_SYSCALLS__ -+ -+#include -+ -+#include "tuxonice_sysfs.h" -+#include "tuxonice_modules.h" -+#include "tuxonice.h" -+#include "tuxonice_ui.h" -+#include "tuxonice_netlink.h" -+#include "tuxonice_power_off.h" -+#include "tuxonice_builtin.h" -+ -+static char local_printf_buf[1024]; /* Same as printk - should be safe */ -+struct ui_ops *toi_current_ui; -+EXPORT_SYMBOL_GPL(toi_current_ui); -+ -+/** -+ * toi_wait_for_keypress - Wait for keypress via userui or /dev/console. -+ * -+ * @timeout: Maximum time to wait. -+ * -+ * Wait for a keypress, either from userui or /dev/console if userui isn't -+ * available. The non-userui path is particularly for at boot-time, prior -+ * to userui being started, when we have an important warning to give to -+ * the user. -+ */ -+static char toi_wait_for_keypress(int timeout) -+{ -+ if (toi_current_ui && toi_current_ui->wait_for_key(timeout)) -+ return ' '; -+ -+ return toi_wait_for_keypress_dev_console(timeout); -+} -+ -+/* toi_early_boot_message() -+ * Description: Handle errors early in the process of booting. -+ * The user may press C to continue booting, perhaps -+ * invalidating the image, or space to reboot. -+ * This works from either the serial console or normally -+ * attached keyboard. -+ * -+ * Note that we come in here from init, while the kernel is -+ * locked. If we want to get events from the serial console, -+ * we need to temporarily unlock the kernel. -+ * -+ * toi_early_boot_message may also be called post-boot. -+ * In this case, it simply printks the message and returns. -+ * -+ * Arguments: int Whether we are able to erase the image. -+ * int default_answer. What to do when we timeout. 
This -+ * will normally be continue, but the user might -+ * provide command line options (__setup) to override -+ * particular cases. -+ * Char *. Pointer to a string explaining why we're moaning. -+ */ -+ -+#define say(message, a...) printk(KERN_EMERG message, ##a) -+ -+void toi_early_boot_message(int message_detail, int default_answer, -+ char *warning_reason, ...) -+{ -+#if defined(CONFIG_VT) || defined(CONFIG_SERIAL_CONSOLE) -+ unsigned long orig_state = get_toi_state(), continue_req = 0; -+ unsigned long orig_loglevel = console_loglevel; -+ int can_ask = 1; -+#else -+ int can_ask = 0; -+#endif -+ -+ va_list args; -+ int printed_len; -+ -+ if (!toi_wait) { -+ set_toi_state(TOI_CONTINUE_REQ); -+ can_ask = 0; -+ } -+ -+ if (warning_reason) { -+ va_start(args, warning_reason); -+ printed_len = vsnprintf(local_printf_buf, -+ sizeof(local_printf_buf), -+ warning_reason, -+ args); -+ va_end(args); -+ } -+ -+ if (!test_toi_state(TOI_BOOT_TIME)) { -+ printk("TuxOnIce: %s\n", local_printf_buf); -+ return; -+ } -+ -+ if (!can_ask) { -+ continue_req = !!default_answer; -+ goto post_ask; -+ } -+ -+#if defined(CONFIG_VT) || defined(CONFIG_SERIAL_CONSOLE) -+ console_loglevel = 7; -+ -+ say("=== TuxOnIce ===\n\n"); -+ if (warning_reason) { -+ say("BIG FAT WARNING!! %s\n\n", local_printf_buf); -+ switch (message_detail) { -+ case 0: -+ say("If you continue booting, note that any image WILL" -+ "NOT BE REMOVED.\nTuxOnIce is unable to do so " -+ "because the appropriate modules aren't\n" -+ "loaded. You should manually remove the image " -+ "to avoid any\npossibility of corrupting your " -+ "filesystem(s) later.\n"); -+ break; -+ case 1: -+ say("If you want to use the current TuxOnIce image, " -+ "reboot and try\nagain with the same kernel " -+ "that you hibernated from. 
If you want\n" -+ "to forget that image, continue and the image " -+ "will be erased.\n"); -+ break; -+ } -+ say("Press SPACE to reboot or C to continue booting with " -+ "this kernel\n\n"); -+ if (toi_wait > 0) -+ say("Default action if you don't select one in %d " -+ "seconds is: %s.\n", -+ toi_wait, -+ default_answer == TOI_CONTINUE_REQ ? -+ "continue booting" : "reboot"); -+ } else { -+ say("BIG FAT WARNING!!\n\n" -+ "You have tried to resume from this image before.\n" -+ "If it failed once, it may well fail again.\n" -+ "Would you like to remove the image and boot " -+ "normally?\nThis will be equivalent to entering " -+ "noresume on the\nkernel command line.\n\n" -+ "Press SPACE to remove the image or C to continue " -+ "resuming.\n\n"); -+ if (toi_wait > 0) -+ say("Default action if you don't select one in %d " -+ "seconds is: %s.\n", toi_wait, -+ !!default_answer ? -+ "continue resuming" : "remove the image"); -+ } -+ console_loglevel = orig_loglevel; -+ -+ set_toi_state(TOI_SANITY_CHECK_PROMPT); -+ clear_toi_state(TOI_CONTINUE_REQ); -+ -+ if (toi_wait_for_keypress(toi_wait) == 0) /* We timed out */ -+ continue_req = !!default_answer; -+ else -+ continue_req = test_toi_state(TOI_CONTINUE_REQ); -+ -+#endif /* CONFIG_VT or CONFIG_SERIAL_CONSOLE */ -+ -+post_ask: -+ if ((warning_reason) && (!continue_req)) -+ machine_restart(NULL); -+ -+ restore_toi_state(orig_state); -+ if (continue_req) -+ set_toi_state(TOI_CONTINUE_REQ); -+} -+EXPORT_SYMBOL_GPL(toi_early_boot_message); -+#undef say -+ -+/* -+ * User interface specific /sys/power/tuxonice entries. 
-+ */ -+ -+static struct toi_sysfs_data sysfs_params[] = { -+#if defined(CONFIG_NET) && defined(CONFIG_SYSFS) -+ SYSFS_INT("default_console_level", SYSFS_RW, -+ &toi_bkd.toi_default_console_level, 0, 7, 0, NULL), -+ SYSFS_UL("debug_sections", SYSFS_RW, &toi_bkd.toi_debug_state, 0, -+ 1 << 30, 0), -+ SYSFS_BIT("log_everything", SYSFS_RW, &toi_bkd.toi_action, TOI_LOGALL, -+ 0) -+#endif -+}; -+ -+static struct toi_module_ops userui_ops = { -+ .type = MISC_HIDDEN_MODULE, -+ .name = "printk ui", -+ .directory = "user_interface", -+ .module = THIS_MODULE, -+ .sysfs_data = sysfs_params, -+ .num_sysfs_entries = sizeof(sysfs_params) / -+ sizeof(struct toi_sysfs_data), -+}; -+ -+int toi_register_ui_ops(struct ui_ops *this_ui) -+{ -+ if (toi_current_ui) { -+ printk(KERN_INFO "Only one TuxOnIce user interface module can " -+ "be loaded at a time."); -+ return -EBUSY; -+ } -+ -+ toi_current_ui = this_ui; -+ -+ return 0; -+} -+EXPORT_SYMBOL_GPL(toi_register_ui_ops); -+ -+void toi_remove_ui_ops(struct ui_ops *this_ui) -+{ -+ if (toi_current_ui != this_ui) -+ return; -+ -+ toi_current_ui = NULL; -+} -+EXPORT_SYMBOL_GPL(toi_remove_ui_ops); -+ -+/* toi_console_sysfs_init -+ * Description: Boot time initialisation for user interface. 
-+ */ -+ -+int toi_ui_init(void) -+{ -+ return toi_register_module(&userui_ops); -+} -+ -+void toi_ui_exit(void) -+{ -+ toi_unregister_module(&userui_ops); -+} -diff --git a/kernel/power/tuxonice_ui.h b/kernel/power/tuxonice_ui.h -new file mode 100644 -index 0000000..85fb7cb ---- /dev/null -+++ b/kernel/power/tuxonice_ui.h -@@ -0,0 +1,97 @@ -+/* -+ * kernel/power/tuxonice_ui.h -+ * -+ * Copyright (C) 2004-2010 Nigel Cunningham (nigel at tuxonice net) -+ */ -+ -+enum { -+ DONT_CLEAR_BAR, -+ CLEAR_BAR -+}; -+ -+enum { -+ /* Userspace -> Kernel */ -+ USERUI_MSG_ABORT = 0x11, -+ USERUI_MSG_SET_STATE = 0x12, -+ USERUI_MSG_GET_STATE = 0x13, -+ USERUI_MSG_GET_DEBUG_STATE = 0x14, -+ USERUI_MSG_SET_DEBUG_STATE = 0x15, -+ USERUI_MSG_SPACE = 0x18, -+ USERUI_MSG_GET_POWERDOWN_METHOD = 0x1A, -+ USERUI_MSG_SET_POWERDOWN_METHOD = 0x1B, -+ USERUI_MSG_GET_LOGLEVEL = 0x1C, -+ USERUI_MSG_SET_LOGLEVEL = 0x1D, -+ USERUI_MSG_PRINTK = 0x1E, -+ -+ /* Kernel -> Userspace */ -+ USERUI_MSG_MESSAGE = 0x21, -+ USERUI_MSG_PROGRESS = 0x22, -+ USERUI_MSG_POST_ATOMIC_RESTORE = 0x25, -+ -+ USERUI_MSG_MAX, -+}; -+ -+struct userui_msg_params { -+ u32 a, b, c, d; -+ char text[255]; -+}; -+ -+struct ui_ops { -+ char (*wait_for_key) (int timeout); -+ u32 (*update_status) (u32 value, u32 maximum, const char *fmt, ...); -+ void (*prepare_status) (int clearbar, const char *fmt, ...); -+ void (*cond_pause) (int pause, char *message); -+ void (*abort)(int result_code, const char *fmt, ...); -+ void (*prepare)(void); -+ void (*cleanup)(void); -+ void (*message)(u32 section, u32 level, u32 normally_logged, -+ const char *fmt, ...); -+}; -+ -+extern struct ui_ops *toi_current_ui; -+ -+#define toi_update_status(val, max, fmt, args...) \ -+ (toi_current_ui ? 
(toi_current_ui->update_status) (val, max, fmt, ##args) : \ -+ max) -+ -+#define toi_prepare_console(void) \ -+ do { if (toi_current_ui) \ -+ (toi_current_ui->prepare)(); \ -+ } while (0) -+ -+#define toi_cleanup_console(void) \ -+ do { if (toi_current_ui) \ -+ (toi_current_ui->cleanup)(); \ -+ } while (0) -+ -+#define abort_hibernate(result, fmt, args...) \ -+ do { if (toi_current_ui) \ -+ (toi_current_ui->abort)(result, fmt, ##args); \ -+ else { \ -+ set_abort_result(result); \ -+ } \ -+ } while (0) -+ -+#define toi_cond_pause(pause, message) \ -+ do { if (toi_current_ui) \ -+ (toi_current_ui->cond_pause)(pause, message); \ -+ } while (0) -+ -+#define toi_prepare_status(clear, fmt, args...) \ -+ do { if (toi_current_ui) \ -+ (toi_current_ui->prepare_status)(clear, fmt, ##args); \ -+ else \ -+ printk(KERN_ERR fmt "%s", ##args, "\n"); \ -+ } while (0) -+ -+#define toi_message(sn, lev, log, fmt, a...) \ -+do { \ -+ if (toi_current_ui && (!sn || test_debug_state(sn))) \ -+ toi_current_ui->message(sn, lev, log, fmt, ##a); \ -+} while (0) -+ -+__exit void toi_ui_cleanup(void); -+extern int toi_ui_init(void); -+extern void toi_ui_exit(void); -+extern int toi_register_ui_ops(struct ui_ops *this_ui); -+extern void toi_remove_ui_ops(struct ui_ops *this_ui); -diff --git a/kernel/power/tuxonice_userui.c b/kernel/power/tuxonice_userui.c -new file mode 100644 -index 0000000..625d863 ---- /dev/null -+++ b/kernel/power/tuxonice_userui.c -@@ -0,0 +1,668 @@ -+/* -+ * kernel/power/user_ui.c -+ * -+ * Copyright (C) 2005-2007 Bernard Blackham -+ * Copyright (C) 2002-2010 Nigel Cunningham (nigel at tuxonice net) -+ * -+ * This file is released under the GPLv2. -+ * -+ * Routines for TuxOnIce's user interface. -+ * -+ * The user interface code talks to a userspace program via a -+ * netlink socket. 
-+ *
-+ * The kernel side:
-+ * - starts the userui program;
-+ * - sends text messages and progress bar status;
-+ *
-+ * The user space side:
-+ * - passes messages regarding user requests (abort, toggle reboot etc)
-+ *
-+ */
-+
-+#define __KERNEL_SYSCALLS__
-+
-+#include
-+#include
-+#include
-+#include
-+#include
-+#include
-+#include
-+#include
-+#include
-+#include
-+#include
-+
-+#include "tuxonice_sysfs.h"
-+#include "tuxonice_modules.h"
-+#include "tuxonice.h"
-+#include "tuxonice_ui.h"
-+#include "tuxonice_netlink.h"
-+#include "tuxonice_power_off.h"
-+
-+static char local_printf_buf[1024];	/* Same as printk - should be safe */
-+
-+static struct user_helper_data ui_helper_data;
-+static struct toi_module_ops userui_ops;
-+static int orig_kmsg;
-+
-+static char lastheader[512];
-+static int lastheader_message_len;
-+static int ui_helper_changed; /* Used at resume-time so don't overwrite value
-+				set from initrd/ramfs. */
-+
-+/* Number of distinct progress amounts that userspace can display */
-+static int progress_granularity = 30;
-+
-+static DECLARE_WAIT_QUEUE_HEAD(userui_wait_for_key);
-+
-+/**
-+ * ui_nl_set_state - Update toi_action based on a message from userui.
-+ *
-+ * @n: The bit (1 << bit) to set.
-+ */
-+static void ui_nl_set_state(int n)
-+{
-+	/* Only let them change certain settings */
-+	static const u32 toi_action_mask =
-+		(1 << TOI_REBOOT) | (1 << TOI_PAUSE) |
-+		(1 << TOI_LOGALL) |
-+		(1 << TOI_SINGLESTEP) |
-+		(1 << TOI_PAUSE_NEAR_PAGESET_END);
-+	static unsigned long new_action;
-+
-+	new_action = (toi_bkd.toi_action & (~toi_action_mask)) |
-+		(n & toi_action_mask);
-+
-+	printk(KERN_DEBUG "n is %x. Action flags being changed from %lx "
-+			"to %lx.", n, toi_bkd.toi_action, new_action);
-+	toi_bkd.toi_action = new_action;
-+
-+	if (!test_action_state(TOI_PAUSE) &&
-+			!test_action_state(TOI_SINGLESTEP))
-+		wake_up_interruptible(&userui_wait_for_key);
-+}
-+
-+/**
-+ * userui_post_atomic_restore - Tell userui that atomic restore just happened.
-+ *
-+ * Tell userui that atomic restore just occured, so that it can do things like
-+ * redrawing the screen, re-getting settings and so on.
-+ */
-+static void userui_post_atomic_restore(struct toi_boot_kernel_data *bkd)
-+{
-+	toi_send_netlink_message(&ui_helper_data,
-+			USERUI_MSG_POST_ATOMIC_RESTORE, NULL, 0);
-+}
-+
-+/**
-+ * userui_storage_needed - Report how much memory in image header is needed.
-+ */
-+static int userui_storage_needed(void)
-+{
-+	return sizeof(ui_helper_data.program) + 1 + sizeof(int);
-+}
-+
-+/**
-+ * userui_save_config_info - Fill buffer with config info for image header.
-+ *
-+ * @buf: Buffer into which to put the config info we want to save.
-+ */
-+static int userui_save_config_info(char *buf)
-+{
-+	*((int *) buf) = progress_granularity;
-+	memcpy(buf + sizeof(int), ui_helper_data.program,
-+			sizeof(ui_helper_data.program));
-+	return sizeof(ui_helper_data.program) + sizeof(int) + 1;
-+}
-+
-+/**
-+ * userui_load_config_info - Restore config info from buffer.
-+ *
-+ * @buf: Buffer containing header info loaded.
-+ * @size: Size of data loaded for this module.
-+ */
-+static void userui_load_config_info(char *buf, int size)
-+{
-+	progress_granularity = *((int *) buf);
-+	size -= sizeof(int);
-+
-+	/* Don't load the saved path if one has already been set */
-+	if (ui_helper_changed)
-+		return;
-+
-+	if (size > sizeof(ui_helper_data.program))
-+		size = sizeof(ui_helper_data.program);
-+
-+	memcpy(ui_helper_data.program, buf + sizeof(int), size);
-+	ui_helper_data.program[sizeof(ui_helper_data.program)-1] = '\0';
-+}
-+
-+/**
-+ * set_ui_program_set: Record that userui program was changed.
-+ *
-+ * Side effect routine for when the userui program is set. In an initrd or
-+ * ramfs, the user may set a location for the userui program. If this happens,
-+ * we don't want to reload the value that was saved in the image header. This
-+ * routine allows us to flag that we shouldn't restore the program name from
-+ * the image header.
-+ */
-+static void set_ui_program_set(void)
-+{
-+	ui_helper_changed = 1;
-+}
-+
-+/**
-+ * userui_memory_needed - Tell core how much memory to reserve for us.
-+ */
-+static int userui_memory_needed(void)
-+{
-+	/* ball park figure of 128 pages */
-+	return 128 * PAGE_SIZE;
-+}
-+
-+/**
-+ * userui_update_status - Update the progress bar and (if on) in-bar message.
-+ *
-+ * @value: Current progress percentage numerator.
-+ * @maximum: Current progress percentage denominator.
-+ * @fmt: Message to be displayed in the middle of the progress bar.
-+ *
-+ * Note that a NULL message does not mean that any previous message is erased!
-+ * For that, you need toi_prepare_status with clearbar on.
-+ *
-+ * Returns an unsigned long, being the next numerator (as determined by the
-+ * maximum and progress granularity) where status needs to be updated.
-+ * This is to reduce unnecessary calls to update_status.
-+ */
-+static u32 userui_update_status(u32 value, u32 maximum, const char *fmt, ...)
-+{
-+	static u32 last_step = 9999;
-+	struct userui_msg_params msg;
-+	u32 this_step, next_update;
-+	int bitshift;
-+
-+	if (ui_helper_data.pid == -1)
-+		return 0;
-+
-+	if ((!maximum) || (!progress_granularity))
-+		return maximum;
-+
-+	if (value < 0)
-+		value = 0;
-+
-+	if (value > maximum)
-+		value = maximum;
-+
-+	/* Try to avoid math problems - we can't do 64 bit math here
-+	 * (and shouldn't need it - anyone got screen resolution
-+	 * of 65536 pixels or more?) */
-+	bitshift = fls(maximum) - 16;
-+	if (bitshift > 0) {
-+		u32 temp_maximum = maximum >> bitshift;
-+		u32 temp_value = value >> bitshift;
-+		this_step = (u32)
-+			(temp_value * progress_granularity / temp_maximum);
-+		next_update = (((this_step + 1) * temp_maximum /
-+					progress_granularity) + 1) << bitshift;
-+	} else {
-+		this_step = (u32) (value * progress_granularity / maximum);
-+		next_update = ((this_step + 1) * maximum /
-+				progress_granularity) + 1;
-+	}
-+
-+	if (this_step == last_step)
-+		return next_update;
-+
-+	memset(&msg, 0, sizeof(msg));
-+
-+	msg.a = this_step;
-+	msg.b = progress_granularity;
-+
-+	if (fmt) {
-+		va_list args;
-+		va_start(args, fmt);
-+		vsnprintf(msg.text, sizeof(msg.text), fmt, args);
-+		va_end(args);
-+		msg.text[sizeof(msg.text)-1] = '\0';
-+	}
-+
-+	toi_send_netlink_message(&ui_helper_data, USERUI_MSG_PROGRESS,
-+			&msg, sizeof(msg));
-+	last_step = this_step;
-+
-+	return next_update;
-+}
-+
-+/**
-+ * userui_message - Display a message without necessarily logging it.
-+ *
-+ * @section: Type of message. Messages can be filtered by type.
-+ * @level: Degree of importance of the message. Lower values = higher priority.
-+ * @normally_logged: Whether logged even if log_everything is off.
-+ * @fmt: Message (and parameters).
-+ *
-+ * This function is intended to do the same job as printk, but without normally
-+ * logging what is printed. The point is to be able to get debugging info on
-+ * screen without filling the logs with "1/534. ^M 2/534^M. 3/534^M"
-+ *
-+ * It may be called from an interrupt context - can't sleep!
-+ */
-+static void userui_message(u32 section, u32 level, u32 normally_logged,
-+		const char *fmt, ...)
-+{
-+	struct userui_msg_params msg;
-+
-+	if ((level) && (level > console_loglevel))
-+		return;
-+
-+	memset(&msg, 0, sizeof(msg));
-+
-+	msg.a = section;
-+	msg.b = level;
-+	msg.c = normally_logged;
-+
-+	if (fmt) {
-+		va_list args;
-+		va_start(args, fmt);
-+		vsnprintf(msg.text, sizeof(msg.text), fmt, args);
-+		va_end(args);
-+		msg.text[sizeof(msg.text)-1] = '\0';
-+	}
-+
-+	if (test_action_state(TOI_LOGALL))
-+		printk(KERN_INFO "%s\n", msg.text);
-+
-+	toi_send_netlink_message(&ui_helper_data, USERUI_MSG_MESSAGE,
-+			&msg, sizeof(msg));
-+}
-+
-+/**
-+ * wait_for_key_via_userui - Wait for userui to receive a keypress.
-+ */
-+static void wait_for_key_via_userui(void)
-+{
-+	DECLARE_WAITQUEUE(wait, current);
-+
-+	add_wait_queue(&userui_wait_for_key, &wait);
-+	set_current_state(TASK_INTERRUPTIBLE);
-+
-+	interruptible_sleep_on(&userui_wait_for_key);
-+
-+	set_current_state(TASK_RUNNING);
-+	remove_wait_queue(&userui_wait_for_key, &wait);
-+}
-+
-+/**
-+ * userui_prepare_status - Display high level messages.
-+ *
-+ * @clearbar: Whether to clear the progress bar.
-+ * @fmt...: New message for the title.
-+ *
-+ * Prepare the 'nice display', drawing the header and version, along with the
-+ * current action and perhaps also resetting the progress bar.
-+ */
-+static void userui_prepare_status(int clearbar, const char *fmt, ...)
-+{
-+	va_list args;
-+
-+	if (fmt) {
-+		va_start(args, fmt);
-+		lastheader_message_len = vsnprintf(lastheader, 512, fmt, args);
-+		va_end(args);
-+	}
-+
-+	if (clearbar)
-+		toi_update_status(0, 1, NULL);
-+
-+	if (ui_helper_data.pid == -1)
-+		printk(KERN_EMERG "%s\n", lastheader);
-+	else
-+		toi_message(0, TOI_STATUS, 1, lastheader, NULL);
-+}
-+
-+/**
-+ * toi_wait_for_keypress - Wait for keypress via userui.
-+ *
-+ * @timeout: Maximum time to wait.
-+ *
-+ * Wait for a keypress from userui.
-+ *
-+ * FIXME: Implement timeout?
-+ */
-+static char userui_wait_for_keypress(int timeout)
-+{
-+	char key = '\0';
-+
-+	if (ui_helper_data.pid != -1) {
-+		wait_for_key_via_userui();
-+		key = ' ';
-+	}
-+
-+	return key;
-+}
-+
-+/**
-+ * userui_abort_hibernate - Abort a cycle & tell user if they didn't request it.
-+ *
-+ * @result_code: Reason why we're aborting (1 << bit).
-+ * @fmt: Message to display if telling the user what's going on.
-+ *
-+ * Abort a cycle. If this wasn't at the user's request (and we're displaying
-+ * output), tell the user why and wait for them to acknowledge the message.
-+ */
-+static void userui_abort_hibernate(int result_code, const char *fmt, ...)
-+{
-+	va_list args;
-+	int printed_len = 0;
-+
-+	set_result_state(result_code);
-+
-+	if (test_result_state(TOI_ABORTED))
-+		return;
-+
-+	set_result_state(TOI_ABORTED);
-+
-+	if (test_result_state(TOI_ABORT_REQUESTED))
-+		return;
-+
-+	va_start(args, fmt);
-+	printed_len = vsnprintf(local_printf_buf, sizeof(local_printf_buf),
-+			fmt, args);
-+	va_end(args);
-+	if (ui_helper_data.pid != -1)
-+		printed_len = sprintf(local_printf_buf + printed_len,
-+				" (Press SPACE to continue)");
-+
-+	toi_prepare_status(CLEAR_BAR, "%s", local_printf_buf);
-+
-+	if (ui_helper_data.pid != -1)
-+		userui_wait_for_keypress(0);
-+}
-+
-+/**
-+ * request_abort_hibernate - Abort hibernating or resuming at user request.
-+ *
-+ * Handle the user requesting the cancellation of a hibernation or resume by
-+ * pressing escape.
-+ */
-+static void request_abort_hibernate(void)
-+{
-+	if (test_result_state(TOI_ABORT_REQUESTED) ||
-+			!test_action_state(TOI_CAN_CANCEL))
-+		return;
-+
-+	if (test_toi_state(TOI_NOW_RESUMING)) {
-+		toi_prepare_status(CLEAR_BAR, "Escape pressed. "
-+				"Powering down again.");
-+		set_toi_state(TOI_STOP_RESUME);
-+		while (!test_toi_state(TOI_IO_STOPPED))
-+			schedule();
-+		if (toiActiveAllocator->mark_resume_attempted)
-+			toiActiveAllocator->mark_resume_attempted(0);
-+		toi_power_down();
-+	}
-+
-+	toi_prepare_status(CLEAR_BAR, "--- ESCAPE PRESSED :"
-+			" ABORTING HIBERNATION ---");
-+	set_abort_result(TOI_ABORT_REQUESTED);
-+	wake_up_interruptible(&userui_wait_for_key);
-+}
-+
-+/**
-+ * userui_user_rcv_msg - Receive a netlink message from userui.
-+ *
-+ * @skb: skb received.
-+ * @nlh: Netlink header received.
-+ */
-+static int userui_user_rcv_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
-+{
-+	int type;
-+	int *data;
-+
-+	type = nlh->nlmsg_type;
-+
-+	/* A control message: ignore them */
-+	if (type < NETLINK_MSG_BASE)
-+		return 0;
-+
-+	/* Unknown message: reply with EINVAL */
-+	if (type >= USERUI_MSG_MAX)
-+		return -EINVAL;
-+
-+	/* All operations require privileges, even GET */
-+	if (security_netlink_recv(skb, CAP_NET_ADMIN))
-+		return -EPERM;
-+
-+	/* Only allow one task to receive NOFREEZE privileges */
-+	if (type == NETLINK_MSG_NOFREEZE_ME && ui_helper_data.pid != -1) {
-+		printk(KERN_INFO "Got NOFREEZE_ME request when "
-+				"ui_helper_data.pid is %d.\n", ui_helper_data.pid);
-+		return -EBUSY;
-+	}
-+
-+	data = (int *) NLMSG_DATA(nlh);
-+
-+	switch (type) {
-+	case USERUI_MSG_ABORT:
-+		request_abort_hibernate();
-+		return 0;
-+	case USERUI_MSG_GET_STATE:
-+		toi_send_netlink_message(&ui_helper_data,
-+				USERUI_MSG_GET_STATE, &toi_bkd.toi_action,
-+				sizeof(toi_bkd.toi_action));
-+		return 0;
-+	case USERUI_MSG_GET_DEBUG_STATE:
-+		toi_send_netlink_message(&ui_helper_data,
-+				USERUI_MSG_GET_DEBUG_STATE,
-+				&toi_bkd.toi_debug_state,
-+				sizeof(toi_bkd.toi_debug_state));
-+		return 0;
-+	case USERUI_MSG_SET_STATE:
-+		if (nlh->nlmsg_len < NLMSG_LENGTH(sizeof(int)))
-+			return -EINVAL;
-+		ui_nl_set_state(*data);
-+		return 0;
-+	case USERUI_MSG_SET_DEBUG_STATE:
-+		if (nlh->nlmsg_len < NLMSG_LENGTH(sizeof(int)))
-+			return -EINVAL;
-+		toi_bkd.toi_debug_state = (*data);
-+		return 0;
-+	case USERUI_MSG_SPACE:
-+		wake_up_interruptible(&userui_wait_for_key);
-+		return 0;
-+	case USERUI_MSG_GET_POWERDOWN_METHOD:
-+		toi_send_netlink_message(&ui_helper_data,
-+				USERUI_MSG_GET_POWERDOWN_METHOD,
-+				&toi_poweroff_method,
-+				sizeof(toi_poweroff_method));
-+		return 0;
-+	case USERUI_MSG_SET_POWERDOWN_METHOD:
-+		if (nlh->nlmsg_len != NLMSG_LENGTH(sizeof(char)))
-+			return -EINVAL;
-+		toi_poweroff_method = (unsigned long)(*data);
-+		return 0;
-+	case USERUI_MSG_GET_LOGLEVEL:
-+		toi_send_netlink_message(&ui_helper_data,
-+				USERUI_MSG_GET_LOGLEVEL,
-+				&toi_bkd.toi_default_console_level,
-+				sizeof(toi_bkd.toi_default_console_level));
-+		return 0;
-+	case USERUI_MSG_SET_LOGLEVEL:
-+		if (nlh->nlmsg_len < NLMSG_LENGTH(sizeof(int)))
-+			return -EINVAL;
-+		toi_bkd.toi_default_console_level = (*data);
-+		return 0;
-+	case USERUI_MSG_PRINTK:
-+		printk(KERN_INFO "%s", (char *) data);
-+		return 0;
-+	}
-+
-+	/* Unhandled here */
-+	return 1;
-+}
-+
-+/**
-+ * userui_cond_pause - Possibly pause at user request.
-+ *
-+ * @pause: Whether to pause or just display the message.
-+ * @message: Message to display at the start of pausing.
-+ *
-+ * Potentially pause and wait for the user to tell us to continue. We normally
-+ * only pause when @pause is set. While paused, the user can do things like
-+ * changing the loglevel, toggling the display of debugging sections and such
-+ * like.
-+ */
-+static void userui_cond_pause(int pause, char *message)
-+{
-+	int displayed_message = 0, last_key = 0;
-+
-+	while (last_key != 32 &&
-+		ui_helper_data.pid != -1 &&
-+		((test_action_state(TOI_PAUSE) && pause) ||
-+		 (test_action_state(TOI_SINGLESTEP)))) {
-+		if (!displayed_message) {
-+			toi_prepare_status(DONT_CLEAR_BAR,
-+				"%s Press SPACE to continue.%s",
-+				message ? message : "",
-+				(test_action_state(TOI_SINGLESTEP)) ?
-+				" Single step on." : "");
-+			displayed_message = 1;
-+		}
-+		last_key = userui_wait_for_keypress(0);
-+	}
-+	schedule();
-+}
-+
-+/**
-+ * userui_prepare_console - Prepare the console for use.
-+ *
-+ * Prepare a console for use, saving current kmsg settings and attempting to
-+ * start userui. Console loglevel changes are handled by userui.
-+ */
-+static void userui_prepare_console(void)
-+{
-+	orig_kmsg = vt_kmsg_redirect(fg_console + 1);
-+
-+	ui_helper_data.pid = -1;
-+
-+	if (!userui_ops.enabled) {
-+		printk(KERN_INFO "TuxOnIce: Userui disabled.\n");
-+		return;
-+	}
-+
-+	if (*ui_helper_data.program)
-+		toi_netlink_setup(&ui_helper_data);
-+	else
-+		printk(KERN_INFO "TuxOnIce: Userui program not configured.\n");
-+}
-+
-+/**
-+ * userui_cleanup_console - Cleanup after a cycle.
-+ *
-+ * Tell userui to cleanup, and restore kmsg_redirect to its original value.
-+ */
-+
-+static void userui_cleanup_console(void)
-+{
-+	if (ui_helper_data.pid > -1)
-+		toi_netlink_close(&ui_helper_data);
-+
-+	vt_kmsg_redirect(orig_kmsg);
-+}
-+
-+/*
-+ * User interface specific /sys/power/tuxonice entries.
-+ */
-+
-+static struct toi_sysfs_data sysfs_params[] = {
-+#if defined(CONFIG_NET) && defined(CONFIG_SYSFS)
-+	SYSFS_BIT("enable_escape", SYSFS_RW, &toi_bkd.toi_action,
-+			TOI_CAN_CANCEL, 0),
-+	SYSFS_BIT("pause_between_steps", SYSFS_RW, &toi_bkd.toi_action,
-+			TOI_PAUSE, 0),
-+	SYSFS_INT("enabled", SYSFS_RW, &userui_ops.enabled, 0, 1, 0, NULL),
-+	SYSFS_INT("progress_granularity", SYSFS_RW, &progress_granularity, 1,
-+			2048, 0, NULL),
-+	SYSFS_STRING("program", SYSFS_RW, ui_helper_data.program, 255, 0,
-+			set_ui_program_set),
-+	SYSFS_INT("debug", SYSFS_RW, &ui_helper_data.debug, 0, 1, 0, NULL)
-+#endif
-+};
-+
-+static struct toi_module_ops userui_ops = {
-+	.type				= MISC_MODULE,
-+	.name				= "userui",
-+	.shared_directory		= "user_interface",
-+	.module				= THIS_MODULE,
-+	.storage_needed			= userui_storage_needed,
-+	.save_config_info		= userui_save_config_info,
-+	.load_config_info		= userui_load_config_info,
-+	.memory_needed			= userui_memory_needed,
-+	.post_atomic_restore		= userui_post_atomic_restore,
-+	.sysfs_data			= sysfs_params,
-+	.num_sysfs_entries		= sizeof(sysfs_params) /
-+		sizeof(struct toi_sysfs_data),
-+};
-+
-+static struct ui_ops my_ui_ops = {
-+	.update_status			= userui_update_status,
-+	.message			= userui_message,
-+	.prepare_status			= userui_prepare_status,
-+	.abort				= userui_abort_hibernate,
-+	.cond_pause			= userui_cond_pause,
-+	.prepare			= userui_prepare_console,
-+	.cleanup			= userui_cleanup_console,
-+	.wait_for_key			= userui_wait_for_keypress,
-+};
-+
-+/**
-+ * toi_user_ui_init - Boot time initialisation for user interface.
-+ *
-+ * Invoked from the core init routine.
-+ */
-+static __init int toi_user_ui_init(void)
-+{
-+	int result;
-+
-+	ui_helper_data.nl = NULL;
-+	strncpy(ui_helper_data.program, CONFIG_TOI_USERUI_DEFAULT_PATH, 255);
-+	ui_helper_data.pid = -1;
-+	ui_helper_data.skb_size = sizeof(struct userui_msg_params);
-+	ui_helper_data.pool_limit = 6;
-+	ui_helper_data.netlink_id = NETLINK_TOI_USERUI;
-+	ui_helper_data.name = "userspace ui";
-+	ui_helper_data.rcv_msg = userui_user_rcv_msg;
-+	ui_helper_data.interface_version = 8;
-+	ui_helper_data.must_init = 0;
-+	ui_helper_data.not_ready = userui_cleanup_console;
-+	init_completion(&ui_helper_data.wait_for_process);
-+	result = toi_register_module(&userui_ops);
-+	if (!result)
-+		result = toi_register_ui_ops(&my_ui_ops);
-+	if (result)
-+		toi_unregister_module(&userui_ops);
-+
-+	return result;
-+}
-+
-+#ifdef MODULE
-+/**
-+ * toi_user_ui_ext - Cleanup code for if the core is unloaded.
-+ */
-+static __exit void toi_user_ui_exit(void)
-+{
-+	toi_netlink_close_complete(&ui_helper_data);
-+	toi_remove_ui_ops(&my_ui_ops);
-+	toi_unregister_module(&userui_ops);
-+}
-+
-+module_init(toi_user_ui_init);
-+module_exit(toi_user_ui_exit);
-+MODULE_AUTHOR("Nigel Cunningham");
-+MODULE_DESCRIPTION("TuxOnIce Userui Support");
-+MODULE_LICENSE("GPL");
-+#else
-+late_initcall(toi_user_ui_init);
-+#endif
-diff --git a/kernel/power/user.c b/kernel/power/user.c
-index bf0014d..d1c4ac2 100644
---- a/kernel/power/user.c
-+++ b/kernel/power/user.c
-@@ -64,6 +64,7 @@ static struct snapshot_data {
- } snapshot_state;
-
- atomic_t snapshot_device_available = ATOMIC_INIT(1);
-+EXPORT_SYMBOL_GPL(snapshot_device_available);
-
- static int snapshot_open(struct inode *inode, struct file *filp)
- {
-diff --git a/kernel/printk.c b/kernel/printk.c
-index 1751c45..b7257e3 100644
---- a/kernel/printk.c
-+++ b/kernel/printk.c
-@@ -32,6 +32,7 @@
- #include
- #include
- #include
-+#include
- #include
- #include
- #include
-@@ -68,6 +69,7 @@ int console_printk[4] = {
- 	MINIMUM_CONSOLE_LOGLEVEL,	/* minimum_console_loglevel */
- 	DEFAULT_CONSOLE_LOGLEVEL,	/* default_console_loglevel */
- };
-+EXPORT_SYMBOL_GPL(console_printk);
-
- static int saved_console_loglevel = -1;
-
-@@ -956,6 +958,7 @@ void suspend_console(void)
- 	console_suspended = 1;
- 	up(&console_sem);
- }
-+EXPORT_SYMBOL_GPL(suspend_console);
-
- void resume_console(void)
- {
-@@ -965,6 +968,7 @@ void resume_console(void)
- 	console_suspended = 0;
- 	release_console_sem();
- }
-+EXPORT_SYMBOL_GPL(resume_console);
-
- /**
-  * acquire_console_sem - lock the console system for exclusive use.
-diff --git a/mm/bootmem.c b/mm/bootmem.c
-index 7d14868..e01836f 100644
---- a/mm/bootmem.c
-+++ b/mm/bootmem.c
-@@ -23,6 +23,7 @@
- unsigned long max_low_pfn;
- unsigned long min_low_pfn;
- unsigned long max_pfn;
-+EXPORT_SYMBOL_GPL(max_pfn);
-
- #ifdef CONFIG_CRASH_DUMP
- /*
-diff --git a/mm/highmem.c b/mm/highmem.c
-index 9c1e627..b0facc3 100644
---- a/mm/highmem.c
-+++ b/mm/highmem.c
-@@ -57,6 +57,7 @@ unsigned int nr_free_highpages (void)
-
- 	return pages;
- }
-+EXPORT_SYMBOL_GPL(nr_free_highpages);
-
- static int pkmap_count[LAST_PKMAP];
- static unsigned int last_pkmap_nr;
-diff --git a/mm/memory.c b/mm/memory.c
-index 09e4b1b..fe93399 100644
---- a/mm/memory.c
-+++ b/mm/memory.c
-@@ -1243,6 +1243,7 @@ no_page_table:
- 		return ERR_PTR(-EFAULT);
- 	return page;
- }
-+EXPORT_SYMBOL_GPL(follow_page);
-
- int __get_user_pages(struct task_struct *tsk, struct mm_struct *mm,
- 		unsigned long start, int nr_pages, unsigned int gup_flags,
-diff --git a/mm/mmzone.c b/mm/mmzone.c
-index f5b7d17..72a6770 100644
---- a/mm/mmzone.c
-+++ b/mm/mmzone.c
-@@ -14,6 +14,7 @@ struct pglist_data *first_online_pgdat(void)
- {
- 	return NODE_DATA(first_online_node);
- }
-+EXPORT_SYMBOL_GPL(first_online_pgdat);
-
- struct pglist_data *next_online_pgdat(struct pglist_data *pgdat)
- {
-@@ -23,6 +24,7 @@ struct pglist_data *next_online_pgdat(struct pglist_data *pgdat)
- 		return NULL;
- 	return NODE_DATA(nid);
- }
-+EXPORT_SYMBOL_GPL(next_online_pgdat);
-
- /*
-  * next_zone - helper magic for for_each_zone()
-@@ -42,6 +44,7 @@ struct zone *next_zone(struct zone *zone)
- 	}
- 	return zone;
- }
-+EXPORT_SYMBOL_GPL(next_zone);
-
- static inline int zref_in_nodemask(struct zoneref *zref, nodemask_t *nodes)
- {
-diff --git a/mm/page-writeback.c b/mm/page-writeback.c
-index 0b19943..4d31aa0 100644
---- a/mm/page-writeback.c
-+++ b/mm/page-writeback.c
-@@ -99,6 +99,7 @@ unsigned int dirty_expire_interval = 30 * 100; /* centiseconds */
-  * Flag that makes the machine dump writes/reads and block dirtyings.
-  */
- int block_dump;
-+EXPORT_SYMBOL_GPL(block_dump);
-
- /*
-  * Flag that puts the machine in "laptop mode". Doubles as a timeout in jiffies:
-diff --git a/mm/page_alloc.c b/mm/page_alloc.c
-index 8deb9d0..2ffc3f4 100644
---- a/mm/page_alloc.c
-+++ b/mm/page_alloc.c
-@@ -2106,6 +2106,26 @@ static unsigned int nr_free_zone_pages(int offset)
- 	return sum;
- }
-
-+static unsigned int nr_unallocated_zone_pages(int offset)
-+{
-+	struct zoneref *z;
-+	struct zone *zone;
-+
-+	/* Just pick one node, since fallback list is circular */
-+	unsigned int sum = 0;
-+
-+	struct zonelist *zonelist = node_zonelist(numa_node_id(), GFP_KERNEL);
-+
-+	for_each_zone_zonelist(zone, z, zonelist, offset) {
-+		unsigned long high = high_wmark_pages(zone);
-+		unsigned long left = zone_page_state(zone, NR_FREE_PAGES);
-+		if (left > high)
-+			sum += left - high;
-+	}
-+
-+	return sum;
-+}
-+
- /*
-  * Amount of free RAM allocatable within ZONE_DMA and ZONE_NORMAL
-  */
-@@ -2116,6 +2136,15 @@ unsigned int nr_free_buffer_pages(void)
- EXPORT_SYMBOL_GPL(nr_free_buffer_pages);
-
- /*
-+ * Amount of free RAM allocatable within ZONE_DMA and ZONE_NORMAL
-+ */
-+unsigned int nr_unallocated_buffer_pages(void)
-+{
-+	return nr_unallocated_zone_pages(gfp_zone(GFP_USER));
-+}
-+EXPORT_SYMBOL_GPL(nr_unallocated_buffer_pages);
-+
-+/*
-  * Amount of free RAM allocatable within all zones
-  */
- unsigned int nr_free_pagecache_pages(void)
-diff --git a/mm/shmem.c b/mm/shmem.c
-index eef4ebe..1adeead 100644
---- a/mm/shmem.c
-+++ b/mm/shmem.c
-@@ -1568,6 +1568,8 @@ static struct inode *shmem_get_inode(struct super_block *sb, int mode,
- 		memset(info, 0, (char *)inode - (char *)info);
- 		spin_lock_init(&info->lock);
- 		info->flags = flags & VM_NORESERVE;
-+		if (flags & VM_ATOMIC_COPY)
-+			inode->i_flags |= S_ATOMIC_COPY;
- 		INIT_LIST_HEAD(&info->swaplist);
- 		cache_no_acl(inode);
-
-diff --git a/mm/swap_state.c b/mm/swap_state.c
-index 6d1daeb..eced4ef 100644
---- a/mm/swap_state.c
-+++ b/mm/swap_state.c
-@@ -46,6 +46,7 @@ struct address_space swapper_space = {
- 	.i_mmap_nonlinear = LIST_HEAD_INIT(swapper_space.i_mmap_nonlinear),
- 	.backing_dev_info = &swap_backing_dev_info,
- };
-+EXPORT_SYMBOL_GPL(swapper_space);
-
- #define INC_CACHE_INFO(x)	do { swap_cache_info.x++; } while (0)
-
-diff --git a/mm/swapfile.c b/mm/swapfile.c
-index 6c0585b..9c563b5 100644
---- a/mm/swapfile.c
-+++ b/mm/swapfile.c
-@@ -39,7 +39,6 @@
- static bool swap_count_continued(struct swap_info_struct *, pgoff_t,
- 				 unsigned char);
- static void free_swap_count_continuations(struct swap_info_struct *);
---static sector_t map_swap_entry(swp_entry_t, struct block_device**);
-
- static DEFINE_SPINLOCK(swap_lock);
- static unsigned int nr_swapfiles;
-@@ -477,6 +476,7 @@ noswap:
- 	spin_unlock(&swap_lock);
- 	return (swp_entry_t) {0};
- }
-+EXPORT_SYMBOL_GPL(get_swap_page);
-
- /* The only caller of this function is now susupend routine */
- swp_entry_t get_swap_page_of_type(int type)
-@@ -499,6 +499,7 @@ swp_entry_t get_swap_page_of_type(int type)
- 	spin_unlock(&swap_lock);
- 	return (swp_entry_t) {0};
- }
-+EXPORT_SYMBOL_GPL(get_swap_page_of_type);
-
- static struct swap_info_struct *swap_info_get(swp_entry_t entry)
- {
-@@ -619,6 +620,7 @@ void swapcache_free(swp_entry_t entry, struct page *page)
- 		spin_unlock(&swap_lock);
- 	}
- }
-+EXPORT_SYMBOL_GPL(swap_free);
-
- /*
-  * How many references to page are currently swapped out?
-@@ -1263,7 +1265,7 @@ static void drain_mmlist(void)
-  * Note that the type of this function is sector_t, but it returns page offset
-  * into the bdev, not sector offset.
-  */
---static sector_t map_swap_entry(swp_entry_t entry, struct block_device **bdev)
-+sector_t map_swap_entry(swp_entry_t entry, struct block_device **bdev)
- {
- 	struct swap_info_struct *sis;
- 	struct swap_extent *start_se;
-@@ -1290,6 +1292,7 @@ static sector_t map_swap_entry(swp_entry_t entry, struct block_device **bdev)
- 		BUG_ON(se == start_se);		/* It *must* be present */
- 	}
- }
-+EXPORT_SYMBOL_GPL(map_swap_entry);
-
- /*
-  * Returns the page offset into bdev for the specified page's swap entry.
-@@ -1632,6 +1635,7 @@ out_dput:
- out:
- 	return err;
- }
-+EXPORT_SYMBOL_GPL(sys_swapoff);
-
- #ifdef CONFIG_PROC_FS
- /* iterator */
-@@ -2055,6 +2059,7 @@ out:
- 	}
- 	return error;
- }
-+EXPORT_SYMBOL_GPL(sys_swapon);
-
- void si_swapinfo(struct sysinfo *val)
- {
-@@ -2072,6 +2077,7 @@ void si_swapinfo(struct sysinfo *val)
- 	val->totalswap = total_swap_pages + nr_to_be_unused;
- 	spin_unlock(&swap_lock);
- }
-+EXPORT_SYMBOL_GPL(si_swapinfo);
-
- /*
-  * Verify that a swap entry is valid and increment its swap map count.
-@@ -2179,6 +2185,13 @@ int swapcache_prepare(swp_entry_t entry)
- 	return __swap_duplicate(entry, SWAP_HAS_CACHE);
- }
-
-+
-+struct swap_info_struct *get_swap_info_struct(unsigned type)
-+{
-+	return swap_info[type];
-+}
-+EXPORT_SYMBOL_GPL(get_swap_info_struct);
-+
- /*
-  * swap_lock prevents swap_map being freed. Don't grab an extra
-  * reference on the swaphandle, it doesn't matter if it becomes unused.
-diff --git a/mm/vmscan.c b/mm/vmscan.c
-index c26986c..ac2a07d 100644
---- a/mm/vmscan.c
-+++ b/mm/vmscan.c
-@@ -2285,6 +2285,9 @@ void wakeup_kswapd(struct zone *zone, int order)
- 	if (!populated_zone(zone))
- 		return;
-
-+	if (freezer_is_on())
-+		return;
-+
- 	pgdat = zone->zone_pgdat;
- 	if (zone_watermark_ok(zone, order, low_wmark_pages(zone), 0, 0))
- 		return;
-@@ -2372,6 +2375,7 @@ unsigned long shrink_all_memory(unsigned long nr_to_reclaim)
-
- 	return nr_reclaimed;
- }
-+EXPORT_SYMBOL_GPL(shrink_all_memory);
- #endif /* CONFIG_HIBERNATION */
-
- /* It's optimal to keep kswapds on the same CPUs as their memory, but