123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299300301302303304305306307308309310311312313314315316317318319320321322323324325326327328329330331332333334335336337338339340341342343344345346347348349350351352353354355356357358359360361362363364365366367368369370371372373374375376377378379380381382383384385386387388389390391392393394395396397398399400401402403404405406407408409410411412413414415416417418419420421422423424425426427428429430431432433434435436437438439440441442443444445446447448449450451452453454455456457458459460461462463464465466467468469470471472473474475476477478479480481482483484485486487488489490491492493494495496497498499500501502503 |
- <!-- Creator : groff version 1.22.4 -->
- <!-- CreationDate: Tue Jul 18 07:11:06 2023 -->
- <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
- "http://www.w3.org/TR/html4/loose.dtd">
- <html>
- <head>
- <meta name="generator" content="groff -Thtml, see www.gnu.org">
- <meta http-equiv="Content-Type" content="text/html; charset=US-ASCII">
- <meta name="Content-Style" content="text/css">
- <style type="text/css">
- p { margin-top: 0; margin-bottom: 0; vertical-align: top }
- pre { margin-top: 0; margin-bottom: 0; vertical-align: top }
- table { margin-top: 0; margin-bottom: 0; vertical-align: top }
- h1 { text-align: center }
- </style>
- <title></title>
- </head>
- <body>
- <hr>
- <p>LIBARCHIVE-FORMATS(5) BSD File Formats Manual
- LIBARCHIVE-FORMATS(5)</p>
- <p style="margin-top: 1em"><b>NAME</b></p>
- <p style="margin-left:6%;"><b>libarchive-formats</b>
- — archive formats supported by the libarchive
- library</p>
- <p style="margin-top: 1em"><b>DESCRIPTION</b></p>
- <p style="margin-left:6%;">The libarchive(3) library reads
- and writes a variety of streaming archive formats. Generally
- speaking, all of these archive formats consist of a series
- of “entries”. Each entry stores a single file
- system object, such as a file, directory, or symbolic
- link.</p>
- <p style="margin-left:6%; margin-top: 1em">The following
- provides a brief description of each format supported by
- libarchive, with some information about recognized
- extensions or limitations of the current library support.
- Note that just because a format is supported by libarchive
- does not imply that a program that uses libarchive will
- support that format. Applications that use libarchive
- specify which formats they wish to support, though many
- programs do use libarchive convenience functions to enable
- all supported formats.</p>
- <p style="margin-left:6%; margin-top: 1em"><b>Tar
- Formats</b> <br>
- The libarchive(3) library can read most tar archives. It can
- write POSIX-standard “ustar” and “pax
- interchange” formats as well as v7 tar format and a
- subset of the legacy GNU tar format.</p>
- <p style="margin-left:6%; margin-top: 1em">All tar formats
- store each entry in one or more 512-byte records. The first
- record is used for file metadata, including filename,
- timestamp, and mode information, and the file data is stored
- in subsequent records. Later variants have extended this by
- either appropriating undefined areas of the header record,
- extending the header to multiple records, or by storing
- special entries that modify the interpretation of subsequent
- entries.</p>
- <p style="margin-top: 1em"><b>gnutar</b></p>
- <p style="margin-left:17%; margin-top: 1em">The
- libarchive(3) library can read most GNU-format tar archives.
- It currently supports the most popular GNU extensions,
- including modern long filename and linkname support, as well
- as atime and ctime data. The libarchive library does not
- support multi-volume archives, nor the old GNU long filename
- format. It can read GNU sparse file entries, including the
- new POSIX-based formats.</p>
- <p style="margin-left:17%; margin-top: 1em">The
- libarchive(3) library can write GNU tar format, including
- long filename and linkname support, as well as atime and
- ctime data.</p>
- <p style="margin-top: 1em"><b>pax</b></p>
- <p style="margin-left:17%; margin-top: 1em">The
- libarchive(3) library can read and write POSIX-compliant pax
- interchange format archives. Pax interchange format archives
- are an extension of the older ustar format that adds a
- separate entry with additional attributes stored as
- key/value pairs immediately before each regular entry. The
- presence of these additional entries is the only difference
- between pax interchange format and the older ustar format.
- The extended attributes are of unlimited length and are
- stored as UTF-8 Unicode strings. Keywords defined in the
- standard are in all lowercase; vendors are allowed to define
- custom keys by preceding them with the vendor name in all
- uppercase. When writing pax archives, libarchive uses many
- of the SCHILY keys defined by Joerg Schilling’s
- “star” archiver and a few LIBARCHIVE keys. The
- libarchive library can read most of the SCHILY keys and most
- of the GNU keys introduced by GNU tar. It silently ignores
- any keywords that it does not understand.</p>
- <p style="margin-left:17%; margin-top: 1em">The pax
- interchange format converts filenames to Unicode and stores
- them using the UTF-8 encoding. Prior to libarchive 3.0,
- libarchive erroneously assumed that the system
- wide-character routines natively supported Unicode. This
- caused it to mis-handle non-ASCII filenames on systems that
- did not satisfy this assumption.</p>
- <p style="margin-top: 1em"><b>restricted pax</b></p>
- <p style="margin-left:17%;">The libarchive library can also
- write pax archives in which it attempts to suppress the
- extended attributes entry whenever possible. The result will
- be identical to a ustar archive unless the extended
- attributes entry is required to store a long file name, long
- linkname, extended ACL, file flags, or if any of the
- standard ustar data (user name, group name, UID, GID, etc)
- cannot be fully represented in the ustar header. In all
- cases, the result can be dearchived by any program that can
- read POSIX-compliant pax interchange format archives.
- Programs that correctly read ustar format (see below) will
- also be able to read this format; any extended attributes
- will be extracted as separate files stored in
- <i>PaxHeader</i> directories.</p>
- <p style="margin-top: 1em"><b>ustar</b></p>
- <p style="margin-left:17%; margin-top: 1em">The libarchive
- library can both read and write this format. This format has
- the following limitations:</p>
- <p><b>•</b></p>
- <p style="margin-left:22%;">Device major and minor numbers
- are limited to 21 bits. Nodes with larger numbers will not
- be added to the archive.</p>
- <p><b>•</b></p>
- <p style="margin-left:22%;">Path names in the archive are
- limited to 255 bytes. (Shorter if there is no / character in
- exactly the right place.)</p>
- <p><b>•</b></p>
- <p style="margin-left:22%;">Symbolic links and hard links
- are stored in the archive with the name of the referenced
- file. This name is limited to 100 bytes.</p>
- <p><b>•</b></p>
- <p style="margin-left:22%;">Extended attributes, file
- flags, and other extended security information cannot be
- stored.</p>
- <p><b>•</b></p>
- <p style="margin-left:22%;">Archive entries are limited to
- 8 gigabytes in size.</p>
- <p style="margin-left:17%;">Note that the pax interchange
- format has none of these restrictions. The ustar format is
- old and widely supported. It is recommended when
- compatibility is the primary concern.</p>
- <p style="margin-top: 1em"><b>v7</b></p>
- <p style="margin-left:17%; margin-top: 1em">The libarchive
- library can read and write the legacy v7 tar format. This
- format has the following limitations:</p>
- <p><b>•</b></p>
- <p style="margin-left:22%;">Only regular files,
- directories, and symbolic links can be archived. Block and
- character device nodes, FIFOs, and sockets cannot be
- archived.</p>
- <p><b>•</b></p>
- <p style="margin-left:22%;">Path names in the archive are
- limited to 100 bytes.</p>
- <p><b>•</b></p>
- <p style="margin-left:22%;">Symbolic links and hard links
- are stored in the archive with the name of the referenced
- file. This name is limited to 100 bytes.</p>
- <p><b>•</b></p>
- <p style="margin-left:22%;">User and group information are
- stored as numeric IDs; there is no provision for storing
- user or group names.</p>
- <p><b>•</b></p>
- <p style="margin-left:22%;">Extended attributes, file
- flags, and other extended security information cannot be
- stored.</p>
- <p><b>•</b></p>
- <p style="margin-left:22%;">Archive entries are limited to
- 8 gigabytes in size.</p>
- <p style="margin-left:17%;">Generally, users should prefer
- the ustar format for portability as the v7 tar format is
- both less useful and less portable.</p>
- <p style="margin-left:6%; margin-top: 1em">The libarchive
- library also reads a variety of commonly-used extensions to
- the basic tar format. These extensions are recognized
- automatically whenever they appear.</p>
- <p style="margin-top: 1em">Numeric extensions.</p>
- <p style="margin-left:17%;">The POSIX standards require
- fixed-length numeric fields to be written with some
- character position reserved for terminators. Libarchive
- allows these fields to be written without terminator
- characters. This extends the allowable range; in particular,
- ustar archives with this extension can support entries up to
- 64 gigabytes in size. Libarchive also recognizes base-256
- values in most numeric fields. This essentially removes all
- limitations on file size, modification time, and device
- numbers.</p>
- <p style="margin-top: 1em">Solaris extensions</p>
- <p style="margin-left:17%;">Libarchive recognizes ACL and
- extended attribute records written by Solaris tar.</p>
- <p style="margin-left:6%; margin-top: 1em">The first tar
- program appeared in Seventh Edition Unix in 1979. The first
- official standard for the tar file format was the
- “ustar” (Unix Standard Tar) format defined by
- POSIX in 1988. POSIX.1-2001 extended the ustar format to
- create the “pax interchange” format.</p>
- <p style="margin-left:6%; margin-top: 1em"><b>Cpio
- Formats</b> <br>
- The libarchive library can read and write a number of common
- cpio variants. A cpio archive stores each entry as a
- fixed-size header followed by a variable-length filename and
- variable-length data. Unlike the tar format, the cpio format
- does only minimal padding of the header or file data. There
- are several cpio variants, which differ primarily in how
- they store the initial header: some store the values as
- octal or hexadecimal numbers in ASCII, others as binary
- values of varying byte order and length.</p>
- <p style="margin-top: 1em"><b>binary</b></p>
- <p style="margin-left:17%; margin-top: 1em">The libarchive
- library transparently reads both big-endian and
- little-endian variants of the the two binary cpio formats;
- the original one from PWB/UNIX, and the later, more widely
- used, variant. This format used 32-bit binary values for
- file size and mtime, and 16-bit binary values for the other
- fields. The formats support only the file types present in
- UNIX at the time of their creation. File sizes are limited
- to 24 bits in the PWB format, because of the limits of the
- file system, and to 31 bits in the newer binary format,
- where signed 32 bit longs were used.</p>
- <p style="margin-top: 1em"><b>odc</b></p>
- <p style="margin-left:17%; margin-top: 1em">This is the
- POSIX standardized format, which is officially known as the
- “cpio interchange format” or the
- “octet-oriented cpio archive format” and
- sometimes unofficially referred to as the “old
- character format”. This format stores the header
- contents as octal values in ASCII. It is standard, portable,
- and immune from byte-order confusion. File sizes and mtime
- are limited to 33 bits (8GB file size), other fields are
- limited to 18 bits.</p>
- <p style="margin-top: 1em"><b>SVR4/newc</b></p>
- <p style="margin-left:17%;">The libarchive library can read
- both CRC and non-CRC variants of this format. The SVR4
- format uses eight-digit hexadecimal values for all header
- fields. This limits file size to 4GB, and also limits the
- mtime and other fields to 32 bits. The SVR4 format can
- optionally include a CRC of the file contents, although
- libarchive does not currently verify this CRC.</p>
- <p style="margin-left:6%; margin-top: 1em">Cpio first
- appeared in PWB/UNIX 1.0, which was released within AT&T
- in 1977. PWB/UNIX 1.0 formed the basis of System III Unix,
- released outside of AT&T in 1981. This makes cpio older
- than tar, although cpio was not included in Version 7
- AT&T Unix. As a result, the tar command became much
- better known in universities and research groups that used
- Version 7. The combination of the <b>find</b> and
- <b>cpio</b> utilities provided very precise control over
- file selection. Unfortunately, the format has many
- limitations that make it unsuitable for widespread use. Only
- the POSIX format permits files over 4GB, and its 18-bit
- limit for most other fields makes it unsuitable for modern
- systems. In addition, cpio formats only store numeric
- UID/GID values (not usernames and group names), which can
- make it very difficult to correctly transfer archives across
- systems with dissimilar user numbering.</p>
- <p style="margin-left:6%; margin-top: 1em"><b>Shar
- Formats</b> <br>
- A “shell archive” is a shell script that, when
- executed on a POSIX-compliant system, will recreate a
- collection of file system objects. The libarchive library
- can write two different kinds of shar archives:</p>
- <p style="margin-top: 1em"><b>shar</b></p>
- <p style="margin-left:17%; margin-top: 1em">The traditional
- shar format uses a limited set of POSIX commands, including
- echo(1), mkdir(1), and sed(1). It is suitable for portably
- archiving small collections of plain text files. However, it
- is not generally well-suited for large archives (many
- implementations of sh(1) have limits on the size of a
- script) nor should it be used with non-text files.</p>
- <p style="margin-top: 1em"><b>shardump</b></p>
- <p style="margin-left:17%;">This format is similar to shar
- but encodes files using uuencode(1) so that the result will
- be a plain text file regardless of the file contents. It
- also includes additional shell commands that attempt to
- reproduce as many file attributes as possible, including
- owner, mode, and flags. The additional commands used to
- restore file attributes make shardump archives less portable
- than plain shar archives.</p>
- <p style="margin-left:6%; margin-top: 1em"><b>ISO9660
- format</b> <br>
- Libarchive can read and extract from files containing
- ISO9660-compliant CDROM images. In many cases, this can
- remove the need to burn a physical CDROM just in order to
- read the files contained in an ISO9660 image. It also avoids
- security and complexity issues that come with virtual mounts
- and loopback devices. Libarchive supports the most common
- Rockridge extensions and has partial support for Joliet
- extensions. If both extensions are present, the Joliet
- extensions will be used and the Rockridge extensions will be
- ignored. In particular, this can create problems with
- hardlinks and symlinks, which are supported by Rockridge but
- not by Joliet.</p>
- <p style="margin-left:6%; margin-top: 1em">Libarchive reads
- ISO9660 images using a streaming strategy. This allows it to
- read compressed images directly (decompressing on the fly)
- and allows it to read images directly from network sockets,
- pipes, and other non-seekable data sources. This strategy
- works well for optimized ISO9660 images created by many
- popular programs. Such programs collect all directory
- information at the beginning of the ISO9660 image so it can
- be read from a physical disk with a minimum of seeking.
- However, not all ISO9660 images can be read in this
- fashion.</p>
- <p style="margin-left:6%; margin-top: 1em">Libarchive can
- also write ISO9660 images. Such images are fully optimized
- with the directory information preceding all file data. This
- is done by storing all file data to a temporary file while
- collecting directory information in memory. When the image
- is finished, libarchive writes out the directory structure
- followed by the file data. The location used for the
- temporary file can be changed by the usual environment
- variables.</p>
- <p style="margin-left:6%; margin-top: 1em"><b>Zip
- format</b> <br>
- Libarchive can read and write zip format archives that have
- uncompressed entries and entries compressed with the
- “deflate” algorithm. Other zip compression
- algorithms are not supported. It can extract jar archives,
- archives that use Zip64 extensions and self-extracting zip
- archives. Libarchive can use either of two different
- strategies for reading Zip archives: a streaming strategy
- which is fast and can handle extremely large archives, and a
- seeking strategy which can correctly process self-extracting
- Zip archives and archives with deleted members or other
- in-place modifications.</p>
- <p style="margin-left:6%; margin-top: 1em">The streaming
- reader processes Zip archives as they are read. It can read
- archives of arbitrary size from tape or network sockets, and
- can decode Zip archives that have been separately compressed
- or encoded. However, self-extracting Zip archives and
- archives with certain types of modifications cannot be
- correctly handled. Such archives require that the reader
- first process the Central Directory, which is ordinarily
- located at the end of a Zip archive and is thus inaccessible
- to the streaming reader. If the program using libarchive has
- enabled seek support, then libarchive will use this to
- processes the central directory first.</p>
- <p style="margin-left:6%; margin-top: 1em">In particular,
- the seeking reader must be used to correctly handle
- self-extracting archives. Such archives consist of a program
- followed by a regular Zip archive. The streaming reader
- cannot parse the initial program portion, but the seeking
- reader starts by reading the Central Directory from the end
- of the archive. Similarly, Zip archives that have been
- modified in-place can have deleted entries or other garbage
- data that can only be accurately detected by first reading
- the Central Directory.</p>
- <p style="margin-left:6%; margin-top: 1em"><b>Archive
- (library) file format</b> <br>
- The Unix archive format (commonly created by the ar(1)
- archiver) is a general-purpose format which is used almost
- exclusively for object files to be read by the link editor
- ld(1). The ar format has never been standardised. There are
- two common variants: the GNU format derived from SVR4, and
- the BSD format, which first appeared in 4.4BSD. The two
- differ primarily in their handling of filenames longer than
- 15 characters: the GNU/SVR4 variant writes a filename table
- at the beginning of the archive; the BSD format stores each
- long filename in an extension area adjacent to the entry.
- Libarchive can read both extensions, including archives that
- may include both types of long filenames. Programs using
- libarchive can write GNU/SVR4 format if they provide an
- entry called <i>//</i> containing a filename table to be
- written into the archive before any of the entries. Any
- entries whose names are not in the filename table will be
- written using BSD-style long filenames. This can cause
- problems for programs such as GNU ld that do not support the
- BSD-style long filenames.</p>
- <p style="margin-left:6%; margin-top: 1em"><b>mtree</b>
- <br>
- Libarchive can read and write files in mtree(5) format. This
- format is not a true archive format, but rather a textual
- description of a file hierarchy in which each line specifies
- the name of a file and provides specific metadata about that
- file. Libarchive can read all of the keywords supported by
- both the NetBSD and FreeBSD versions of mtree(8), although
- many of the keywords cannot currently be stored in an
- archive_entry object. When writing, libarchive supports use
- of the archive_write_set_options(3) interface to specify
- which keywords should be included in the output. If
- libarchive was compiled with access to suitable
- cryptographic libraries (such as the OpenSSL libraries), it
- can compute hash entries such as <b>sha512</b> or <b>md5</b>
- from file data being written to the mtree writer.</p>
- <p style="margin-left:6%; margin-top: 1em">When reading an
- mtree file, libarchive will locate the corresponding files
- on disk using the <b>contents</b> keyword if present or the
- regular filename. If it can locate and open the file on
- disk, it will use that to fill in any metadata that is
- missing from the mtree file and will read the file contents
- and return those to the program using libarchive. If it
- cannot locate and open the file on disk, libarchive will
- return an error for any attempt to read the entry body.</p>
- <p style="margin-left:6%; margin-top: 1em"><b>7-Zip</b>
- <br>
- Libarchive can read and write 7-Zip format archives. TODO:
- Need more information</p>
- <p style="margin-left:6%; margin-top: 1em"><b>CAB</b> <br>
- Libarchive can read Microsoft Cabinet ( “CAB”)
- format archives. TODO: Need more information.</p>
- <p style="margin-left:6%; margin-top: 1em"><b>LHA</b> <br>
- TODO: Information about libarchive’s LHA support</p>
- <p style="margin-left:6%; margin-top: 1em"><b>RAR</b> <br>
- Libarchive has limited support for reading RAR format
- archives. Currently, libarchive can read RARv3 format
- archives which have been either created uncompressed, or
- compressed using any of the compression methods supported by
- the RARv3 format. Libarchive can also read self-extracting
- RAR archives.</p>
- <p style="margin-left:6%; margin-top: 1em"><b>Warc</b> <br>
- Libarchive can read and write “web archives”.
- TODO: Need more information</p>
- <p style="margin-left:6%; margin-top: 1em"><b>XAR</b> <br>
- Libarchive can read and write the XAR format used by many
- Apple tools. TODO: Need more information</p>
- <p style="margin-top: 1em"><b>SEE ALSO</b></p>
- <p style="margin-left:6%;">ar(1), cpio(1), mkisofs(1),
- shar(1), tar(1), zip(1), zlib(3), cpio(5), mtree(5),
- tar(5)</p>
- <p style="margin-left:6%; margin-top: 1em">BSD
- December 27, 2016 BSD</p>
- <hr>
- </body>
- </html>
|