aboutsummaryrefslogtreecommitdiff
diff options
context:
space:
mode:
-rw-r--r--docs/tar_pax.txt239
1 files changed, 239 insertions, 0 deletions
diff --git a/docs/tar_pax.txt b/docs/tar_pax.txt
new file mode 100644
index 000000000..8a3f1e755
--- /dev/null
+++ b/docs/tar_pax.txt
@@ -0,0 +1,239 @@
+'pax headers' is POSIX 2003 (iirc) addition designed to fix
+tar format limitations - older tar format has fixed fields
+for everything (filename, uid, filesize etc) which can overflow.
+
+pax Header Block
+
+The pax header block shall be identical to the ustar header block
+described in ustar Interchange Format, except that two additional
+typeflag values are defined:
+
+x
+ Represents extended header records for the following file in
+the archive (which shall have its own ustar header block).
+
+g
+ Represents global extended header records for the following
+files in the archive. Each value shall affect all subsequent files
+that do not override that value in their own extended header
+record and until another global extended header record is reached
+that provides another value for the same field. The typeflag g
+global headers should not be used with interchange media that
+could suffer partial data loss in transporting the archive.
+
+For both of these types, the size field shall be the size of the
+extended header records in octets. The other fields in the header
+block are not meaningful to this version of the pax utility.
+However, if this archive is read by a pax utility conforming to
+the ISO POSIX-2:1993 standard, the header block fields are used to
+create a regular file that contains the extended header records as
+data. Therefore, header block field values should be selected to
+provide reasonable file access to this regular file.
+
+A further difference from the ustar header block is that data
+blocks for files of typeflag 1 (the digit one) (hard link) may be
+included, which means that the size field may be greater than
+zero.
+
+pax Extended Header
+
+An extended header shall consist of one or more records, each
+constructed as follows:
+
+"%d %s=%s\n", <length>, <keyword>, <value>
+
+The <length> field shall be the decimal length of the extended
+header record in octets, including length string itself and the
+trailing <newline>.
+
+[skip]
+
+atime
+ The file access time for the following file(s), equivalent to
+the value of the st_atime member of the stat structure for a file,
+as described by the stat() function. The access time shall be
+restored if the process has the appropriate privilege required to
+do so. The format of the <value> shall be as described in pax
+Extended Header File Times.
+
+charset
+ The name of the character set used to encode the data in the
+following file(s).
+
+ The encoding is included in an extended header for information
+only; when pax is used as described in IEEE Std 1003.1-2001, it
+shall not translate the file data into any other encoding. The
+BINARY entry indicates unencoded binary data.
+
+ When used in write or copy mode, it is implementation-defined
+whether pax includes a charset extended header record for a file.
+
+comment
+ A series of characters used as a comment. All characters in
+the <value> field shall be ignored by pax.
+
+gid
+ The group ID of the group that owns the file, expressed as a
+decimal number using digits from the ISO/IEC 646:1991 standard.
+This record shall override the gid field in the following header
+block(s). When used in write or copy mode, pax shall include a gid
+extended header record for each file whose group ID is greater
+than 2097151 (octal 7777777).
+
+gname
+ The group of the file(s), formatted as a group name in the
+group database. This record shall override the gid and gname
+fields in the following header block(s), and any gid extended
+header record. When used in read, copy, or list mode, pax shall
+translate the name from the UTF-8 encoding in the header record to
+the character set appropriate for the group database on the
+receiving system. If any of the UTF-8 characters cannot be
+translated, and if the -o invalid= UTF-8 option is not specified,
+the results are implementation-defined. When used in write or copy
+mode, pax shall include a gname extended header record for each
+file whose group name cannot be represented entirely with the
+letters and digits of the portable character set.
+
+linkpath
+ The pathname of a link being created to another file, of any
+type, previously archived. This record shall override the linkname
+field in the following ustar header block(s). The following ustar
+header block shall determine the type of link created. If typeflag
+of the following header block is 1, it shall be a hard link. If
+typeflag is 2, it shall be a symbolic link and the linkpath value
+shall be the contents of the symbolic link. The pax utility shall
+translate the name of the link (contents of the symbolic link)
+from the UTF-8 encoding to the character set appropriate for the
+local file system. When used in write or copy mode, pax shall
+include a linkpath extended header record for each link whose
+pathname cannot be represented entirely with the members of the
+portable character set other than NUL.
+
+mtime
+ The file modification time of the following file(s),
+equivalent to the value of the st_mtime member of the stat
+structure for a file, as described in the stat() function. This
+record shall override the mtime field in the following header
+block(s). The modification time shall be restored if the process
+has the appropriate privilege required to do so. The format of the
+<value> shall be as described in pax Extended Header File Times.
+
+path
+ The pathname of the following file(s). This record shall
+override the name and prefix fields in the following header
+block(s). The pax utility shall translate the pathname of the file
+from the UTF-8 encoding to the character set appropriate for the
+local file system.
+
+ When used in write or copy mode, pax shall include a path
+extended header record for each file whose pathname cannot be
+represented entirely with the members of the portable character
+set other than NUL.
+
+realtime.any
+ The keywords prefixed by "realtime." are reserved for future
+standardization.
+
+security.any
+ The keywords prefixed by "security." are reserved for future
+standardization.
+
+size
+ The size of the file in octets, expressed as a decimal number
+using digits from the ISO/IEC 646:1991 standard. This record shall
+override the size field in the following header block(s). When
+used in write or copy mode, pax shall include a size extended
+header record for each file with a size value greater than
+8589934591 (octal 77777777777).
+
+uid
+ The user ID of the file owner, expressed as a decimal number
+using digits from the ISO/IEC 646:1991 standard. This record shall
+override the uid field in the following header block(s). When used
+in write or copy mode, pax shall include a uid extended header
+record for each file whose owner ID is greater than 2097151 (octal
+7777777).
+
+uname
+ The owner of the following file(s), formatted as a user name
+in the user database. This record shall override the uid and uname
+fields in the following header block(s), and any uid extended
+header record. When used in read, copy, or list mode, pax shall
+translate the name from the UTF-8 encoding in the header record to
+the character set appropriate for the user database on the
+receiving system. If any of the UTF-8 characters cannot be
+translated, and if the -o invalid= UTF-8 option is not specified,
+the results are implementation-defined. When used in write or copy
+mode, pax shall include a uname extended header record for each
+file whose user name cannot be represented entirely with the
+letters and digits of the portable character set.
+
+If the <value> field is zero length, it shall delete any header
+block field, previously entered extended header value, or global
+extended header value of the same name.
+
+If a keyword in an extended header record (or in a -o
+option-argument) overrides or deletes a corresponding field in the
+ustar header block, pax shall ignore the contents of that header
+block field.
+
+Unlike the ustar header block fields, NULs shall not delimit
+<value>s; all characters within the <value> field shall be
+considered data for the field. None of the length limitations of
+the ustar header block fields in ustar Header Block shall apply to
+the extended header records.
+
+pax Extended Header File Times
+
+Time records shall be formatted as a decimal representation of the
+time in seconds since the Epoch. If a period ( '.' ) decimal point
+character is present, the digits to the right of the point shall
+represent the units of a subsecond timing granularity. In read or
+copy mode, the pax utility shall truncate the time of a file to
+the greatest value that is not greater than the input header
+file time. In write or copy mode, the pax utility shall output a
+time exactly if it can be represented exactly as a decimal number,
+and otherwise shall generate only enough digits so that the same
+time shall be recovered if the file is extracted on a system whose
+underlying implementation supports the same time granularity.
+
+Example from Linux kernel archive tarball:
+
+00000000 70 61 78 5f 67 6c 6f 62 61 6c 5f 68 65 61 64 65 |pax_global_heade|
+00000010 72 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |r...............|
+00000020 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
+00000030 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
+00000040 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
+00000050 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
+00000060 00 00 00 00 30 30 30 30 36 36 36 00 30 30 30 30 |....0000666.0000|
+00000070 30 30 30 00 30 30 30 30 30 30 30 00 30 30 30 30 |000.0000000.0000|
+00000080 30 30 30 30 30 36 34 00 30 30 30 30 30 30 30 30 |0000064.00000000|
+00000090 30 30 30 00 30 30 31 34 30 35 33 00 67 00 00 00 |000.0014053.g...|
+000000a0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
+000000b0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
+000000c0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
+000000d0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
+000000e0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
+000000f0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
+00000100 00 75 73 74 61 72 00 30 30 67 69 74 00 00 00 00 |.ustar.00git....|
+00000110 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
+00000120 00 00 00 00 00 00 00 00 00 67 69 74 00 00 00 00 |.........git....|
+00000130 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
+00000140 00 00 00 00 00 00 00 00 00 30 30 30 30 30 30 30 |.........0000000|
+00000150 00 30 30 30 30 30 30 30 00 00 00 00 00 00 00 00 |.0000000........|
+00000160 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
+00000170 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
+00000180 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
+00000190 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
+000001a0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
+000001b0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
+000001c0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
+000001d0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
+000001e0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
+000001f0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
+00000200 35 32 20 63 6f 6d 6d 65 6e 74 3d 62 31 30 35 30 |52 comment=b1050|
+00000210 32 62 32 32 61 31 32 30 39 64 36 62 34 37 36 33 |2b22a1209d6b4763|
+00000220 39 64 38 38 62 38 31 32 62 32 31 66 62 35 39 34 |9d88b812b21fb594|
+00000230 39 65 34 0a 00 00 00 00 00 00 00 00 00 00 00 00 |9e4.............|
+00000240 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
+...