Global Cleanup
This project was sleeping for too long. Too many parts are not clean at all. So, here some cleanup: - add more test unit - clean msdos date/time format for zip - add crc32 - add version support - rewrite notes - rewrite the whole interface from scratch. - update github actions - update license (MIT) - create dotzip application - update notes regarding data-structure used - fix date and time ms-dos format (issue with endianess) - fix local file header encoding - update documentation - update with new extra fields and third party support - add extended timestamp third party support - add unix info new third party support
This commit is contained in:
311
notes/README.md
311
notes/README.md
@@ -1,117 +1,228 @@
|
||||
---
|
||||
---
|
||||
|
||||
This documentation is a work in progress regarding the way to use
|
||||
Dotzip Elixir module. It should:
|
||||
Dotzip Elixir module. This module should:
|
||||
|
||||
* be easy to understand (e.g. easy API)
|
||||
* compatible with Erlang/Elixir release
|
||||
* portable to any systems supported by Erlang/Elixir
|
||||
* usable as stream of data
|
||||
* offering an high level representation of the data/metadata
|
||||
* easy to debug
|
||||
* **be easy to understand (e.g. easy API)**: interfaces should follow
|
||||
OTP and/or Elixir principles. Anyone who want to use it should
|
||||
simply read introduction page. The documentation should cover 99%
|
||||
of user requirement but can offer also some "expert" feature.
|
||||
|
||||
* **be compatible with Erlang/Elixir release**: this project should
|
||||
be compatible with BEAP virtual machine and usable with other
|
||||
languages like Joxa, Clojuerl, Erlang and Elixir.
|
||||
|
||||
* **be portable to any systems supported by Erlang/Elixir**: it
|
||||
should work on any "recent" version of OTP (>R19).
|
||||
|
||||
* **be usable as stream of data**: this project should not have a
|
||||
high memory impact, if an archive is too big, it should not be a
|
||||
problem to use it in small systems.
|
||||
|
||||
* **offer an high level representation of the data/metadata**: a
|
||||
clean representation of ZIP archive should be generated and
|
||||
hackable. Anyone who want to design his own module or feature
|
||||
should have all information to do it.
|
||||
|
||||
* **have no external requirement or dependencies**: this project
|
||||
should not use any external project, except if the dependency is
|
||||
vital for the project.
|
||||
|
||||
* **be easy to debug**: parsing, encoding and decoding files can be
|
||||
quite complex, this project should offer enough function to let
|
||||
anyone debug this project and other ZIP related projects.
|
||||
|
||||
* **offer a framework**: this project is a first step to create an
|
||||
archive framework, where anyone can archive and compress data in
|
||||
any kind of format.
|
||||
|
||||
* **offer benchmark**: this project should be benchmarked and
|
||||
generate stats.
|
||||
|
||||
# Elixir
|
||||
* **offer different way to use**: the first target is to use this
|
||||
project as library but, it could be nice to use it as compression
|
||||
daemon and/or system tool.
|
||||
|
||||
# Dotzip Documentation Draft
|
||||
|
||||
(Work in progress) Dotzip can be used as library or as OTP
|
||||
application. As library, Dotzip act as a highlevel interface for
|
||||
creating Zip files. As OTP application, Dotzip act as a framework to
|
||||
create, analyze or extract Zip archives by using optimized
|
||||
functions. To use it as application, users will need to start `Dotzip`
|
||||
application.
|
||||
|
||||
```
|
||||
Application.start(:dotzip)
|
||||
```
|
||||
|
||||
(Work in progress) One can also stop it.
|
||||
|
||||
```
|
||||
Application.stop(:dotzip)
|
||||
```
|
||||
|
||||
## Dotzip Library
|
||||
|
||||
(Work in progress) To decode a Zip file from bitstring, one can use
|
||||
`Dotzip.decode/1` or `Dotzip.decode/2` functions.
|
||||
|
||||
```elixir
|
||||
{:ok, dotzip} = Dotzip.decode(bitstring)
|
||||
```
|
||||
|
||||
(Work in progress) In another hand, to encode abstract Dotzip data
|
||||
structure as Zip file, one can use `Dotzip.encode/1` or
|
||||
`Dotzip.encode/2` functions.
|
||||
|
||||
```elixir
|
||||
{:ok, bitstring} = Dotzip.encode(dotzip)
|
||||
```
|
||||
|
||||
(Work in progress) The structure used must be easy to understand and
|
||||
should contain all information required. A Zip file is mainly divided
|
||||
in 2 parts, a central directory record containing global information
|
||||
about the zip file, and a list of files, each one with their own
|
||||
header.
|
||||
|
||||
NOTE: static data-structures vs dynamic data-structures, here two
|
||||
worlds are colliding, a strict decomposition of the data can be done
|
||||
by using `tuples` or by using `maps`. Using `tuples` can be used on
|
||||
practically any version of OTP but will require more work on the
|
||||
library. In other hand, using `maps` can help to design a flexible
|
||||
library but old OTP versions will be impacted. The first
|
||||
implementation will use a mix between tuples and maps, all important
|
||||
Dotzip datastructures will be tagged with `:dotzip_*` tag.
|
||||
|
||||
All the following part is a draft.
|
||||
|
||||
### File(s) Structure(s)
|
||||
|
||||
To be defined
|
||||
|
||||
```elixir
|
||||
@type dotzip_encryption_header() :: %{}
|
||||
@type dotzip_file_data() :: <<>> | {:dotzip_file_ref, <<>>}
|
||||
@type dotzip_data_description() :: %{}
|
||||
```
|
||||
|
||||
```elixir
|
||||
@type dotzip_file() :: {:dotzip_file,
|
||||
%{ dotzip_file_header,
|
||||
:dotzip_encryption_header => dotzip_encryption_header(),
|
||||
:dotzip_file_data => dotzip_file_data(),
|
||||
:dotzip_data_descriptor => dotzip_data_descriptor()
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
```elixir
|
||||
@type dotzip_files() :: [dotzip_file(), ...]
|
||||
```
|
||||
|
||||
### Central Directory Record Structure(s)
|
||||
|
||||
To be defined
|
||||
|
||||
```elixir
|
||||
@type dotzip_central_directory_record() :: %{
|
||||
|
||||
}
|
||||
```
|
||||
|
||||
```elixir
|
||||
@typedoc ""
|
||||
@type dotzip_struct() :: {:dotzip,
|
||||
%{
|
||||
:dotzip_central_directory_record => dotzip_central_directory_record,
|
||||
:dotzip_files => dotzip_files
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## ZIP File Extraction and Analysis
|
||||
|
||||
(Work in progress) A Zip file can contain many files, and sometime,
|
||||
big one. To avoid using the whole memory of the system, Dotzip can
|
||||
load only metadata instead of the whole archive by using
|
||||
`Dotzip.preload/1` or `Dotzip.preload/2` functions.
|
||||
|
||||
```elixir
|
||||
{:ok, reference_preload} = Dotzip.preload("/path/to/archive.zip")
|
||||
```
|
||||
|
||||
(Work in progress) In other hand, a file can be fully loaded by using
|
||||
`Dotzip.load/1` or `Dotzip.load/2` functions.
|
||||
|
||||
```elixir
|
||||
{:ok, reference_load} = Dotzip.load("/path/to/archive.zip")
|
||||
```
|
||||
|
||||
(Work in progress) Dotzip can analyze the content of the archive by
|
||||
using `Dotzip.analyze/1` or `Dotzip.analyze/2` functions. These
|
||||
functions will ensure the file is in good state or alert if something
|
||||
is not correct. `Dotzip.analyze` features may be extended by using
|
||||
creating `Dotzip.Analyzer`.
|
||||
|
||||
```elixir
|
||||
{:ok, analysis} = Dotzip.analyze(reference)
|
||||
```
|
||||
|
||||
(Work in progress) The whole archive can be extracted by using
|
||||
`Dotzip.extract/2` or `Dotzip.extract/3` functions.
|
||||
|
||||
```elixir
|
||||
{:ok, info} = Dotzip.extract(reference, "/path/to/extract")
|
||||
{:ok, info} = Dotzip.extract(reference, "/other/path/to/extract", verbose: true)
|
||||
```
|
||||
|
||||
(Work in progress) When a file is not required anymore, this file can
|
||||
be unloaded by using `Dotzip.unload/1` function. Both the path of the
|
||||
archive or the reference can be used.
|
||||
|
||||
```elixir
|
||||
:ok = Dotzip.unload("/path/to/archive.zip")
|
||||
:ok = Dotzip.unload(reference)
|
||||
```
|
||||
|
||||
## ZIP File Creation
|
||||
|
||||
Some example of the usage. Creating a zip file should be easy and only
|
||||
based on a simple object creation.
|
||||
(Work in progress) Some example of the usage. Creating a zip file
|
||||
should be easy and only based on a simple object creation. To create a
|
||||
new empty archive, `Dotzip.new/0` or `Dotzip.new/1` functions can be
|
||||
used.
|
||||
|
||||
```elixir
|
||||
Dotzip.new()
|
||||
|> Dotzip.to_binary()
|
||||
reference = Dotzip.new()
|
||||
```
|
||||
|
||||
Adding file should also be easy. Those files are loaded only when the
|
||||
file is converted in binary.
|
||||
(Work in progress) Adding files must also be quite
|
||||
easy. `Dotzip.add/2` or `Dotzip.add/3` functions can be used to add
|
||||
files based on different sources. By default, absolute paths are
|
||||
converted to relavative path by removing the root part of the path.
|
||||
|
||||
```elixir
|
||||
Dotzip.new()
|
||||
|> Dotzip.file("/path/to/file/one", "/one")
|
||||
|> Dotzip.file("/path/to/file/two", "/two")
|
||||
|> Dotzip.to_binary()
|
||||
# add a file from absolute path
|
||||
{:ok, info} = Dotzip.add(reference, "/path/to/my/file")
|
||||
|
||||
# add a directory and its whole content from absolute path
|
||||
{:ok, info} = Dotzip.add(reference, "/path/to/my/directory", recursive: true)
|
||||
|
||||
# create a new directory
|
||||
{:ok, info} = Dotzip.add(reference, {:directory, "/my/directory"})
|
||||
|
||||
# create a new file in archive from bitstring
|
||||
{:ok, info} = Dotzip.add(reference, {:raw, "/my/file", "content\n"}", compression: :lz4)
|
||||
|
||||
# create a new file from external url
|
||||
{:ok, info} = Dotzip.add(reference, {:url, "/my/other/file", "https://my.super.site.com/file"})
|
||||
```
|
||||
|
||||
It should also be possible to add recursively the content of a
|
||||
directory.
|
||||
(Work in progress) The whole archive can also share some specific
|
||||
options, like encryption or compression.
|
||||
|
||||
```elixir
|
||||
Dotzip.new()
|
||||
|> Dotzip.directory("/path/to/directory", recursive: true)
|
||||
|> Dotzip.to_binary()
|
||||
```
|
||||
|
||||
A blob is any kind of data direcly stored in memory, from the BEAM.
|
||||
|
||||
```elixir
|
||||
Dotzip.new()
|
||||
|> Dotzip.blob("my raw data here", "/file_path")
|
||||
|> Dotzip.blob("another content", "/file_path2")
|
||||
|> Dotzip.to_binary()
|
||||
```
|
||||
|
||||
The option of the zip file can be added directly when the zip is
|
||||
created.
|
||||
|
||||
```elixir
|
||||
Dotzip.new(compression: :unshrink)
|
||||
```
|
||||
|
||||
A list of supported compression methods can be found directly in the
|
||||
library.
|
||||
|
||||
```elixir
|
||||
Dotzip.compression_methods()
|
||||
```
|
||||
|
||||
Encrypted archive should also be made during the ZIP file creation.
|
||||
|
||||
```elixir
|
||||
Dotzip.new(encryption: :aes_cbc256)
|
||||
```
|
||||
|
||||
or by configuring it after the object was created.
|
||||
|
||||
```elixir
|
||||
Dotzip.new()
|
||||
|> Dotzip.hash(:md5)
|
||||
|> Dotzip.encryption(:aes_cbc256, password: "my_password")
|
||||
```
|
||||
|
||||
Supported method can be printed.
|
||||
|
||||
```elixir
|
||||
Dotzip.encryption_methods()
|
||||
```
|
||||
|
||||
## ZIP File Extraction
|
||||
|
||||
Extract all file from a local archive, present on the filesystem.
|
||||
|
||||
```elixir
|
||||
Dotzip.open_file("/path/to/file.zip")
|
||||
|> Dotzip.extract_all()
|
||||
```
|
||||
|
||||
Extract only one or many files from the local archive.
|
||||
|
||||
```elixir
|
||||
Dotzip.open_file("/path/to/file.zip")
|
||||
|> Dotzip.extract("/path/compressed/file")
|
||||
|> Dotzip.extract("/path/to/compressed.data")
|
||||
```
|
||||
|
||||
Convert the full archive in erlang/elixir term.
|
||||
|
||||
```elixir
|
||||
Dotzip.open_file("/path/to/file.zip")
|
||||
|> Dotzip.to_term()
|
||||
```
|
||||
|
||||
Convert a stream archive to erlang/elixir term.
|
||||
|
||||
```elixir
|
||||
Dotzip.open_stream(mydata)
|
||||
|> Dotzip.to_term()
|
||||
# set compression to lz4
|
||||
Dotzip.set(reference, compression: :lz4)
|
||||
|
||||
Dotzip.set(reference, encryption: :aes_cbc256)
|
||||
Dotzip.set(reference, passphrase: "my passphrase")
|
||||
```
|
||||
|
||||
Reference in New Issue
Block a user