123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299300301302303304305306307308309310311312313314315316317318319320321322323324325326327328329330331332333334335336337338339340341342343344345346347348349350351352353354355356357358359360361362363364365366367368369370371372373374375376377378379380381382383384385386387388389390391392393394395396397398399400401402403404405406407408409410411412413414415416417418419420421422423424425426427428429430431432433434435436437438439440441442443444445446447448449450451452453454455456457458459460461462463464465466467468469470471472473474475476477478479480481482 |
- <!-- Creator : groff version 1.22.4 -->
- <!-- CreationDate: Tue Jul 18 07:11:06 2023 -->
- <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
- "http://www.w3.org/TR/html4/loose.dtd">
- <html>
- <head>
- <meta name="generator" content="groff -Thtml, see www.gnu.org">
- <meta http-equiv="Content-Type" content="text/html; charset=US-ASCII">
- <meta name="Content-Style" content="text/css">
- <style type="text/css">
- p { margin-top: 0; margin-bottom: 0; vertical-align: top }
- pre { margin-top: 0; margin-bottom: 0; vertical-align: top }
- table { margin-top: 0; margin-bottom: 0; vertical-align: top }
- h1 { text-align: center }
- </style>
- <title></title>
- </head>
- <body>
- <hr>
- <p>CPIO(5) BSD File Formats Manual CPIO(5)</p>
- <p style="margin-top: 1em"><b>NAME</b></p>
- <p style="margin-left:6%;"><b>cpio</b> — format of
- cpio archive files</p>
- <p style="margin-top: 1em"><b>DESCRIPTION</b></p>
- <p style="margin-left:6%;">The <b>cpio</b> archive format
- collects any number of files, directories, and other file
- system objects (symbolic links, device nodes, etc.) into a
- single stream of bytes.</p>
- <p style="margin-left:6%; margin-top: 1em"><b>General
- Format</b> <br>
- Each file system object in a <b>cpio</b> archive comprises a
- header record with basic numeric metadata followed by the
- full pathname of the entry and the file data. The header
- record stores a series of integer values that generally
- follow the fields in <i>struct stat</i>. (See stat(2) for
- details.) The variants differ primarily in how they store
- those integers (binary, octal, or hexadecimal). The header
- is followed by the pathname of the entry (the length of the
- pathname is stored in the header) and any file data. The end
- of the archive is indicated by a special record with the
- pathname “TRAILER!!!”.</p>
- <p style="margin-left:6%; margin-top: 1em"><b>PWB
- format</b> <br>
- The PWB binary <b>cpio</b> format is the original format,
- when cpio was introduced as part of the Programmer’s
- Work Bench system, a variant of 6th Edition UNIX. It stores
- numbers as 2-byte and 4-byte binary values. Each entry
- begins with a header in the following format:</p>
- <p style="margin-left:14%; margin-top: 1em">struct
- header_pwb_cpio { <br>
- short h_magic; <br>
- short h_dev; <br>
- short h_ino; <br>
- short h_mode; <br>
- short h_uid; <br>
- short h_gid; <br>
- short h_nlink; <br>
- short h_majmin; <br>
- long h_mtime; <br>
- short h_namesize; <br>
- long h_filesize; <br>
- };</p>
- <p style="margin-left:6%; margin-top: 1em">The <i>short</i>
- fields here are 16-bit integer values, while the <i>long</i>
- fields are 32 bit integers. Since PWB UNIX, like the 6th
- Edition UNIX it was based on, only ran on PDP-11 computers,
- they are in PDP-endian format, which has little-endian
- shorts, and big-endian longs. That is, the long integer
- whose hexadecimal representation is 0x12345678 would be
- stored in four successive bytes as 0x34, 0x12, 0x78, 0x56.
- The fields are as follows:</p>
- <p style="margin-top: 1em"><i>h_magic</i></p>
- <p style="margin-left:17%;">The integer value octal
- 070707.</p>
- <p style="margin-top: 1em"><i>h_dev</i>, <i>h_ino</i></p>
- <p style="margin-left:17%;">The device and inode numbers
- from the disk. These are used by programs that read
- <b>cpio</b> archives to determine when two entries refer to
- the same file. Programs that synthesize <b>cpio</b> archives
- should be careful to set these to distinct values for each
- entry.</p>
- <p style="margin-top: 1em"><i>h_mode</i></p>
- <p style="margin-left:17%; margin-top: 1em">The mode
- specifies both the regular permissions and the file type,
- and it also holds a couple of bits that are irrelevant to
- the cpio format, because the field is actually a raw copy of
- the mode field in the inode representing the file. These are
- the IALLOC flag, which shows that the inode entry is in use,
- and the ILARG flag, which shows that the file it represents
- is large enough to have indirect blocks pointers in the
- inode. The mode is decoded as follows:</p>
- <p style="margin-top: 1em">0100000</p>
- <p style="margin-left:28%; margin-top: 1em">IALLOC flag -
- irrelevant to cpio.</p>
- <p>0060000</p>
- <p style="margin-left:28%; margin-top: 1em">This masks the
- file type bits.</p>
- <p>0040000</p>
- <p style="margin-left:28%; margin-top: 1em">File type value
- for directories.</p>
- <p>0020000</p>
- <p style="margin-left:28%; margin-top: 1em">File type value
- for character special devices.</p>
- <p>0060000</p>
- <p style="margin-left:28%; margin-top: 1em">File type value
- for block special devices.</p>
- <p>0010000</p>
- <p style="margin-left:28%; margin-top: 1em">ILARG flag -
- irrelevant to cpio.</p>
- <p>0004000</p>
- <p style="margin-left:28%; margin-top: 1em">SUID bit.</p>
- <p>0002000</p>
- <p style="margin-left:28%; margin-top: 1em">SGID bit.</p>
- <p>0001000</p>
- <p style="margin-left:28%; margin-top: 1em">Sticky bit.</p>
- <p>0000777</p>
- <p style="margin-left:28%; margin-top: 1em">The lower 9
- bits specify read/write/execute permissions for world,
- group, and user following standard POSIX conventions.</p>
- <p style="margin-top: 1em"><i>h_uid</i>, <i>h_gid</i></p>
- <p style="margin-left:17%;">The numeric user id and group
- id of the owner.</p>
- <p style="margin-top: 1em"><i>h_nlink</i></p>
- <p style="margin-left:17%;">The number of links to this
- file. Directories always have a value of at least two here.
- Note that hardlinked files include file data with every copy
- in the archive.</p>
- <p style="margin-top: 1em"><i>h_majmin</i></p>
- <p style="margin-left:17%;">For block special and character
- special entries, this field contains the associated device
- number, with the major number in the high byte, and the
- minor number in the low byte. For all other entry types, it
- should be set to zero by writers and ignored by readers.</p>
- <p style="margin-top: 1em"><i>h_mtime</i></p>
- <p style="margin-left:17%;">Modification time of the file,
- indicated as the number of seconds since the start of the
- epoch, 00:00:00 UTC January 1, 1970.</p>
- <p style="margin-top: 1em"><i>h_namesize</i></p>
- <p style="margin-left:17%;">The number of bytes in the
- pathname that follows the header. This count includes the
- trailing NUL byte.</p>
- <p style="margin-top: 1em"><i>h_filesize</i></p>
- <p style="margin-left:17%;">The size of the file. Note that
- this archive format is limited to 16 megabyte file sizes,
- because PWB UNIX, like 6th Edition, only used an unsigned 24
- bit integer for the file size internally.</p>
- <p style="margin-left:6%; margin-top: 1em">The pathname
- immediately follows the fixed header. If <b>h_namesize</b>
- is odd, an additional NUL byte is added after the pathname.
- The file data is then appended, again with an additional NUL
- appended if needed to get the next header at an even
- offset.</p>
- <p style="margin-left:6%; margin-top: 1em">Hardlinked files
- are not given special treatment; the full file contents are
- included with each copy of the file.</p>
- <p style="margin-left:6%; margin-top: 1em"><b>New Binary
- Format</b> <br>
- The new binary <b>cpio</b> format showed up when cpio was
- adopted into late 7th Edition UNIX. It is exactly like the
- PWB binary format, described above, except for three
- changes:</p>
- <p style="margin-left:6%; margin-top: 1em">First, UNIX now
- ran on more than one hardware type, so the endianness of 16
- bit integers must be determined by observing the magic
- number at the start of the header. The 32 bit integers are
- still always stored with the most significant word first,
- though, so each of those two, in the struct shown above, was
- stored as an array of two 16 bit integers, in the
- traditional order. Those 16 bit integers, like all the
- others in the struct, were accessed using a macro that byte
- swapped them if necessary.</p>
- <p style="margin-left:6%; margin-top: 1em">Next, 7th
- Edition had more file types to store, and the IALLOC and
- ILARG flag bits were re-purposed to accommodate these. The
- revised use of the various bits is as follows:</p>
- <p style="margin-top: 1em">0170000</p>
- <p style="margin-left:18%; margin-top: 1em">This masks the
- file type bits.</p>
- <p>0140000</p>
- <p style="margin-left:18%; margin-top: 1em">File type value
- for sockets.</p>
- <p>0120000</p>
- <p style="margin-left:18%; margin-top: 1em">File type value
- for symbolic links. For symbolic links, the link body is
- stored as file data.</p>
- <p>0100000</p>
- <p style="margin-left:18%; margin-top: 1em">File type value
- for regular files.</p>
- <p>0060000</p>
- <p style="margin-left:18%; margin-top: 1em">File type value
- for block special devices.</p>
- <p>0040000</p>
- <p style="margin-left:18%; margin-top: 1em">File type value
- for directories.</p>
- <p>0020000</p>
- <p style="margin-left:18%; margin-top: 1em">File type value
- for character special devices.</p>
- <p>0010000</p>
- <p style="margin-left:18%; margin-top: 1em">File type value
- for named pipes or FIFOs.</p>
- <p>0004000</p>
- <p style="margin-left:18%; margin-top: 1em">SUID bit.</p>
- <p>0002000</p>
- <p style="margin-left:18%; margin-top: 1em">SGID bit.</p>
- <p>0001000</p>
- <p style="margin-left:18%; margin-top: 1em">Sticky bit.</p>
- <p>0000777</p>
- <p style="margin-left:18%; margin-top: 1em">The lower 9
- bits specify read/write/execute permissions for world,
- group, and user following standard POSIX conventions.</p>
- <p style="margin-left:6%; margin-top: 1em">Finally, the
- file size field now represents a signed 32 bit integer in
- the underlying file system, so the maximum file size has
- increased to 2 gigabytes.</p>
- <p style="margin-left:6%; margin-top: 1em">Note that there
- is no obvious way to tell which of the two binary formats an
- archive uses, other than to see which one makes more sense.
- The typical error scenario is that a PWB format archive
- unpacked as if it were in the new format will create named
- sockets instead of directories, and then fail to unpack
- files that should go in those directories. Running
- <i>bsdcpio -itv</i> on an unknown archive will make it
- obvious which it is: if it’s PWB format, directories
- will be listed with an ’s’ instead of a
- ’d’ as the first character of the mode string,
- and the larger files will have a ’?’ in that
- position.</p>
- <p style="margin-left:6%; margin-top: 1em"><b>Portable
- ASCII Format</b> <br>
- Version 2 of the Single UNIX Specification
- (“SUSv2”) standardized an ASCII variant that is
- portable across all platforms. It is commonly known as the
- “old character” format or as the
- “odc” format. It stores the same numeric fields
- as the old binary format, but represents them as 6-character
- or 11-character octal values.</p>
- <p style="margin-left:14%; margin-top: 1em">struct
- cpio_odc_header { <br>
- char c_magic[6]; <br>
- char c_dev[6]; <br>
- char c_ino[6]; <br>
- char c_mode[6]; <br>
- char c_uid[6]; <br>
- char c_gid[6]; <br>
- char c_nlink[6]; <br>
- char c_rdev[6]; <br>
- char c_mtime[11]; <br>
- char c_namesize[6]; <br>
- char c_filesize[11]; <br>
- };</p>
- <p style="margin-left:6%; margin-top: 1em">The fields are
- identical to those in the new binary format. The name and
- file body follow the fixed header. Unlike the binary
- formats, there is no additional padding after the pathname
- or file contents. If the files being archived are themselves
- entirely ASCII, then the resulting archive will be entirely
- ASCII, except for the NUL byte that terminates the name
- field.</p>
- <p style="margin-left:6%; margin-top: 1em"><b>New ASCII
- Format</b> <br>
- The "new" ASCII format uses 8-byte hexadecimal
- fields for all numbers and separates device numbers into
- separate fields for major and minor numbers.</p>
- <p style="margin-left:14%; margin-top: 1em">struct
- cpio_newc_header { <br>
- char c_magic[6]; <br>
- char c_ino[8]; <br>
- char c_mode[8]; <br>
- char c_uid[8]; <br>
- char c_gid[8]; <br>
- char c_nlink[8]; <br>
- char c_mtime[8]; <br>
- char c_filesize[8]; <br>
- char c_devmajor[8]; <br>
- char c_devminor[8]; <br>
- char c_rdevmajor[8]; <br>
- char c_rdevminor[8]; <br>
- char c_namesize[8]; <br>
- char c_check[8]; <br>
- };</p>
- <p style="margin-left:6%; margin-top: 1em">Except as
- specified below, the fields here match those specified for
- the new binary format above.</p>
- <p style="margin-top: 1em"><i>magic</i></p>
- <p style="margin-left:17%; margin-top: 1em">The string
- “070701”.</p>
- <p style="margin-top: 1em"><i>check</i></p>
- <p style="margin-left:17%; margin-top: 1em">This field is
- always set to zero by writers and ignored by readers. See
- the next section for more details.</p>
- <p style="margin-left:6%; margin-top: 1em">The pathname is
- followed by NUL bytes so that the total size of the fixed
- header plus pathname is a multiple of four. Likewise, the
- file data is padded to a multiple of four bytes. Note that
- this format supports only 4 gigabyte files (unlike the older
- ASCII format, which supports 8 gigabyte files).</p>
- <p style="margin-left:6%; margin-top: 1em">In this format,
- hardlinked files are handled by setting the filesize to zero
- for each entry except the first one that appears in the
- archive.</p>
- <p style="margin-left:6%; margin-top: 1em"><b>New CRC
- Format</b> <br>
- The CRC format is identical to the new ASCII format
- described in the previous section except that the magic
- field is set to “070702” and the <i>check</i>
- field is set to the sum of all bytes in the file data. This
- sum is computed treating all bytes as unsigned values and
- using unsigned arithmetic. Only the least-significant 32
- bits of the sum are stored.</p>
- <p style="margin-left:6%; margin-top: 1em"><b>HP
- variants</b> <br>
- The <b>cpio</b> implementation distributed with HPUX used
- XXXX but stored device numbers differently XXX.</p>
- <p style="margin-left:6%; margin-top: 1em"><b>Other
- Extensions and Variants</b> <br>
- Sun Solaris uses additional file types to store extended
- file data, including ACLs and extended attributes, as
- special entries in cpio archives.</p>
- <p style="margin-left:6%; margin-top: 1em">XXX Others?
- XXX</p>
- <p style="margin-top: 1em"><b>SEE ALSO</b></p>
- <p style="margin-left:6%;">cpio(1), tar(5)</p>
- <p style="margin-top: 1em"><b>STANDARDS</b></p>
- <p style="margin-left:6%;">The <b>cpio</b> utility is no
- longer a part of POSIX or the Single Unix Standard. It last
- appeared in Version 2 of the Single UNIX Specification
- (“SUSv2”). It has been supplanted in subsequent
- standards by pax(1). The portable ASCII format is currently
- part of the specification for the pax(1) utility.</p>
- <p style="margin-top: 1em"><b>HISTORY</b></p>
- <p style="margin-left:6%;">The original cpio utility was
- written by Dick Haight while working in AT&T’s
- Unix Support Group. It appeared in 1977 as part of PWB/UNIX
- 1.0, the “Programmer’s Work Bench” derived
- from Version 6 AT&T UNIX that was used internally
- at AT&T. Both the new binary and old character formats
- were in use by 1980, according to the System III source
- released by SCO under their “Ancient Unix”
- license. The character format was adopted as part of IEEE
- Std 1003.1-1988 (“POSIX.1”). XXX when did
- "newc" appear? Who invented it? When did HP come
- out with their variant? When did Sun introduce ACLs and
- extended attributes? XXX</p>
- <p style="margin-top: 1em"><b>BUGS</b></p>
- <p style="margin-left:6%;">The “CRC” format is
- mis-named, as it uses a simple checksum and not a cyclic
- redundancy check.</p>
- <p style="margin-left:6%; margin-top: 1em">The binary
- formats are limited to 16 bits for user id, group id,
- device, and inode numbers. They are limited to 16 megabyte
- and 2 gigabyte file sizes for the older and newer variants,
- respectively.</p>
- <p style="margin-left:6%; margin-top: 1em">The old ASCII
- format is limited to 18 bits for the user id, group id,
- device, and inode numbers. It is limited to 8 gigabyte file
- sizes.</p>
- <p style="margin-left:6%; margin-top: 1em">The new ASCII
- format is limited to 4 gigabyte file sizes.</p>
- <p style="margin-left:6%; margin-top: 1em">None of the cpio
- formats store user or group names, which are essential when
- moving files between systems with dissimilar user or group
- numbering.</p>
- <p style="margin-left:6%; margin-top: 1em">Especially when
- writing older cpio variants, it may be necessary to map
- actual device/inode values to synthesized values that fit
- the available fields. With very large filesystems, this may
- be necessary even for the newer formats.</p>
- <p style="margin-left:6%; margin-top: 1em">BSD
- December 23, 2011 BSD</p>
- <hr>
- </body>
- </html>
|