mirror of
https://github.com/isar/libmdbx.git
synced 2024-10-29 23:19:20 +08:00
mdbx: update README (MacOS support).
Change-Id: Id85b79fb605702fff606b62a5114951bfb9cb22e
This commit is contained in:
parent
e04bfc05fa
commit
887cbc7f00
70
README-RU.md
@ -1,4 +1,4 @@
## The [repository was moved out from Github](https://abf.io/erthink/libmdbx) due to illegal discriminatory restrictions for Russian Crimea and for sovereign crimeans.
## The [repository now only mirrored on the Github](https://abf.io/erthink/libmdbx) due to illegal discriminatory restrictions for Russian Crimea and for sovereign crimeans.
<!-- Required extensions: pymdownx.betterem, pymdownx.tilde, pymdownx.emoji, pymdownx.tasklist, pymdownx.superfences -->
---

@ -6,31 +6,24 @@ libmdbx
======================================
**The revised and extended descendant of [Symas LMDB](https://symas.com/lmdb/).**

*The Future will Positive. Everything will be fine.*
*The Future will (be) Positive. Everything will be fine.*
[![Build Status](https://travis-ci.org/leo-yuriev/libmdbx.svg?branch=master)](https://travis-ci.org/leo-yuriev/libmdbx)
[![Build status](https://ci.appveyor.com/api/projects/status/ue94mlopn50dqiqg/branch/master?svg=true)](https://ci.appveyor.com/project/leo-yuriev/libmdbx/branch/master)
[![Coverity Scan Status](https://scan.coverity.com/projects/12915/badge.svg)](https://scan.coverity.com/projects/reopen-libmdbx)

English version [by Google](https://translate.googleusercontent.com/translate_c?act=url&ie=UTF8&sl=ru&tl=en&u=https://github.com/leo-yuriev/libmdbx/tree/master)
The English version of this README is [here](README.md); there are also translations [by Google](https://translate.googleusercontent.com/translate_c?act=url&ie=UTF8&sl=ru&tl=en&u=https://github.com/leo-yuriev/libmdbx/tree/master)
and [by Yandex](https://translate.yandex.ru/translate?url=https%3A%2F%2Fgithub.com%2FReOpen%2Flibmdbx%2Ftree%2Fmaster&lang=ru-en).

### Project Status
### Project Status

**MDBX is currently being _actively reworked_**: a major change to
both the API and the database format is coming. Unfortunately,
the update will break compatibility with previous versions.
_libmdbx_ works on Linux, FreeBSD, MacOS X and other operating systems
compliant with POSIX.1-2008, and also supports Windows (starting with
Windows XP) as a complementary platform.

The goal of this revolution is to provide a clearer, more reliable API,
to add new features, and to give the database new
properties.

Currently MDBX works on Linux and operating systems compliant with
POSIX.1-2008, and also supports Windows (starting with Windows XP) as a
complementary platform. Support for other operating systems can be
provided on a commercial basis. However, such enhancements (i.e.
pull requests) can be accepted into the mainstream only if
a corresponding public and free Continuous Integration service
is available.
The next version is being developed separately and non-publicly; it
will bring a major change to both the API and the database format. The
goal of this revolution is to provide a clearer and more reliable API,
to add new features, and to give the database new properties.

## Contents
- [Overview](#Обзор)
@ -53,8 +46,7 @@ POSIX.1-2008, а также поддерживает Windows (начиная с
## Overview
_libmdbx_ is an embeddable key-value storage engine with a specific
set of properties and capabilities, aimed at building unique
lightweight solutions with extreme performance under Linux and
Windows.
lightweight solutions with extreme performance.

_libmdbx_ allows multiple processes to jointly read and update
several key-value tables while complying with
@ -84,10 +76,11 @@ _libmdbx_ не использует


### Comparison with other DBMSs
Since _libmdbx_ is currently undergoing a revolution, I thought it
best to limit this section to a link to the [chapter Comparison with
other databases](https://github.com/coreos/bbolt#comparison-with-other-databases)
in the _BoltDB_ description.

For now, please refer to the [chapter "BoltDB comparison with
other
databases"](https://github.com/coreos/bbolt#comparison-with-other-databases),
which is also (mostly) applicable to MDBX.


### History
@ -108,13 +101,13 @@ Tables](https://github.com/leo-yuriev/libfpta), aka ["Позитивные
Technologies](https://www.ptsecurity.ru).


#### Acknowledgments
Howard Chu (Symas Corporation) - the author of LMDB, from which
MDBX originated in 2015.
#### Acknowledgments

Martin Hedenfalk <martin@bzero.se> - the author of the `btree.c` code, which
was used to begin development of LMDB.
Howard Chu <hyc@openldap.org> - the author of the LMDB engine, from
which MDBX originated in 2015.

Martin Hedenfalk <martin@bzero.se> - the author of the `btree.c`
code, which was used to begin development of LMDB.

Main features
=================
@ -332,6 +325,25 @@ Amplification Factor) и RAF (Read Amplification Factor) также Olog(N).
> - double-free attempts;
> - memory corruption and segmentation faults.

32. On **MacOS X**, the `fcntl(F_FULLFSYNC)` system call is used _by
default_ to synchronize data with the disk, since [only
this guarantees data
durability](https://developer.apple.com/library/archive/documentation/System/Conceptual/ManPages_iPhoneOS/man2/fsync.2.html)
in case of a power failure. Unfortunately, in scenarios with a high
rate of write transactions, using `F_FULLFSYNC` significantly
degrades performance compared to LMDB, which
uses the `fsync()` system call. Therefore _libmdbx_ allows
this behavior to be overridden by defining the
`MDBX_OSX_SPEED_INSTEADOF_DURABILITY=1` option when building the library.

33. On **Windows**, _libmdbx_ uses `LockFileEx()` file
locks, since this allows the database to be placed on network drives and
also protects against incompetent user actions
([foolproofing](https://ru.wikipedia.org/wiki/%D0%97%D0%B0%D1%89%D0%B8%D1%82%D0%B0_%D0%BE%D1%82_%D0%B4%D1%83%D1%80%D0%B0%D0%BA%D0%B0)).
Therefore _libmdbx_ may lag slightly behind LMDB in performance tests,
where named mutexes are used.

--------------------------------------------------------------------------------

## Drawbacks and Trade-offs

153
README.md
@ -1,4 +1,4 @@
## The [repository was moved out from Github](https://abf.io/erthink/libmdbx) due to illegal discriminatory restrictions for Russian Crimea and for sovereign crimeans.
## The [repository now only mirrored on the Github](https://abf.io/erthink/libmdbx) due to illegal discriminatory restrictions for Russian Crimea and for sovereign crimeans.
<!-- Required extensions: pymdownx.betterem, pymdownx.tilde, pymdownx.emoji, pymdownx.tasklist, pymdownx.superfences -->
---

@ -6,52 +6,33 @@ libmdbx
======================================
**Revised and extended descendant of [Symas LMDB](https://symas.com/lmdb/).**

*The Future will be positive.*
*The Future will (be) positive.*
[![Build Status](https://travis-ci.org/leo-yuriev/libmdbx.svg?branch=master)](https://travis-ci.org/leo-yuriev/libmdbx)
[![Build status](https://ci.appveyor.com/api/projects/status/ue94mlopn50dqiqg/branch/master?svg=true)](https://ci.appveyor.com/project/leo-yuriev/libmdbx/branch/master)
[![Coverity Scan Status](https://scan.coverity.com/projects/12915/badge.svg)](https://scan.coverity.com/projects/reopen-libmdbx)

## Project Status for now
The Russian version of this README is [here](README-RU.md).

- The stable versions
([_stable/0.0_](https://github.com/leo-yuriev/libmdbx/tree/stable/0.0)
and
[_stable/0.1_](https://github.com/leo-yuriev/libmdbx/tree/stable/0.1)
branches) of _MDBX_ are frozen, i.e. no new features or API changes, but
only bug fixes.
## Project Status

- The next version
([_devel_](https://github.com/leo-yuriev/libmdbx/tree/devel) branch)
**is under active non-public development**, i.e. the current API and set of
features are extremely volatile.
_libmdbx_ works on Linux, FreeBSD, MacOS X and other systems compliant
with POSIX.1-2008, and also supports Windows as a complementary platform.

- The immediate goal of development is to form a stable API and
a stable internal database format, which will make it possible to realise all PLANNED
FEATURES:
1. Integrity check by [Merkle tree](https://en.wikipedia.org/wiki/Merkle_tree);
2. Support for [raw block devices](https://en.wikipedia.org/wiki/Raw_device);
3. Separate place (HDD) for large data items;
4. Using "[Roaring bitmaps](http://roaringbitmap.org/about/)" inside garbage collector;
5. Non-sequential reclaiming, like PostgreSQL's [Vacuum](https://www.postgresql.org/docs/9.1/static/sql-vacuum.html);
6. [Asynchronous lazy data flushing](https://sites.fas.harvard.edu/~cs265/papers/kathuria-2008.pdf) to disk(s);
7. etc...
The next version
([_devel_](https://github.com/leo-yuriev/libmdbx/tree/devel) branch) is
under active non-public development, i.e. the API and set of features are
volatile. The goal of this revolution is to provide a clearer and more
reliable API, and to add a set of features and new database properties.

Don't miss libmdbx for other runtimes.
Don't miss libmdbx for other runtimes:

| Runtime | GitHub | Author |
| ------------- | ------------- | ------------- |
| JVM | [mdbxjni](https://github.com/castortech/mdbxjni) | [Castor Technologies](https://castortech.com/) |
| .NET | [mdbx.NET](https://github.com/wangjia184/mdbx.NET) | [Jerry Wang](https://github.com/wangjia184) |
| Runtime | GitHub | Author |
| ------------- | ------------- | ------------- |
| Java | [mdbxjni](https://github.com/castortech/mdbxjni) | [Castor Technologies](https://castortech.com/) |
| .NET | [mdbx.NET](https://github.com/wangjia184/mdbx.NET) | [Jerry Wang](https://github.com/wangjia184) |

-----

Nowadays MDBX works on Linux and OSes compliant with POSIX.1-2008, and
also supports Windows (since Windows XP) as a complementary platform.
Support for other OSes could be implemented on a commercial basis. However,
such enhancements (i.e. pull requests) could be accepted into the mainstream
only when a corresponding public and free Continuous Integration service
is available.

## Contents
- [Overview](#overview)
- [Comparison with other DBs](#comparison-with-other-dbs)
@ -72,21 +53,28 @@ will be available.

## Overview
_libmdbx_ is an embedded lightweight key-value database engine oriented
for performance under Linux and Windows.
for performance.

_libmdbx_ allows multiple processes to read and update several key-value
tables concurrently, while being
[ACID](https://en.wikipedia.org/wiki/ACID)-compliant, with minimal
overhead and O(log N) operation cost.

_libmdbx_ enforces [serializability](https://en.wikipedia.org/wiki/Serializability) for writers by a single [mutex](https://en.wikipedia.org/wiki/Mutual_exclusion) and affords [wait-free](https://en.wikipedia.org/wiki/Non-blocking_algorithm#Wait-freedom) for parallel readers without atomic/interlocked operations, while writing and reading transactions do not block each other.
_libmdbx_ enforces
[serializability](https://en.wikipedia.org/wiki/Serializability) for
writers by a single
[mutex](https://en.wikipedia.org/wiki/Mutual_exclusion) and affords
[wait-free](https://en.wikipedia.org/wiki/Non-blocking_algorithm#Wait-freedom)
for parallel readers without atomic/interlocked operations, while
writing and reading transactions do not block each other.

_libmdbx_ can guarantee consistency after a crash, depending on the operation mode.
_libmdbx_ can guarantee consistency after a crash, depending on the operation
mode.

_libmdbx_ uses [B+Trees](https://en.wikipedia.org/wiki/B%2B_tree) and
[Memory-Mapping](https://en.wikipedia.org/wiki/Memory-mapped_file), doesn't use
[WAL](https://en.wikipedia.org/wiki/Write-ahead_logging) which
might be a caveat for some workloads.
[Memory-Mapping](https://en.wikipedia.org/wiki/Memory-mapped_file),
doesn't use [WAL](https://en.wikipedia.org/wiki/Write-ahead_logging)
which might be a caveat for some workloads.

### Comparison with other DBs
For now please refer to [chapter of "BoltDB comparison with other
@ -96,15 +84,17 @@ which is also (mostly) applicable to MDBX.
### History
The _libmdbx_ design is based on [Lightning Memory-Mapped
Database](https://en.wikipedia.org/wiki/Lightning_Memory-Mapped_Database).
Initial development took place in the [ReOpenLDAP](https://github.com/leo-yuriev/ReOpenLDAP) project.
About a year later libmdbx was split out into a separate project, which was [presented at Highload++
2015 conference](http://www.highload.ru/2015/abstracts/1831.html).
Initial development took place in the
[ReOpenLDAP](https://github.com/leo-yuriev/ReOpenLDAP) project. About a
year later libmdbx was split out into a separate project, which was
[presented at Highload++ 2015
conference](http://www.highload.ru/2015/abstracts/1831.html).

Since early 2017 _libmdbx_ has been used in [Fast Positive Tables](https://github.com/leo-yuriev/libfpta),
and development is funded by [Positive Technologies](https://www.ptsecurity.com).

#### Acknowledgments
Howard Chu (Symas Corporation) - the author of LMDB, from which
Howard Chu <hyc@openldap.org> - the author of LMDB, from which
MDBX originated in 2015.

Martin Hedenfalk <martin@bzero.se> - the author of `btree.c` code, which
@ -184,20 +174,23 @@ additional resources for that.
[BBWC](https://en.wikipedia.org/wiki/Disk_buffer#Write_acceleration)
this may greatly improve write performance.

4. Fast estimation of range query result size via functions `mdbx_estimate_range()`,
`mdbx_estimate_move()` and `mdbx_estimate_distance()`. E.g. for selecting the
optimal query execution plan.
4. Fast estimation of range query result size via functions
`mdbx_estimate_range()`, `mdbx_estimate_move()` and
`mdbx_estimate_distance()`. E.g. for selecting the optimal query
execution plan.

5. `mdbx_chk` tool for DB integrity check.

6. Support for keys and values of zero length, including multi-values (aka sorted duplicates).
6. Support for keys and values of zero length, including multi-values
(aka sorted duplicates).

7. Ability to assign up to 3 persistent 64-bit markers to a committing transaction with
`mdbx_canary_put()` and then get them in a read transaction by
`mdbx_canary_get()`.
7. Ability to assign up to 3 persistent 64-bit markers to a committing
transaction with `mdbx_canary_put()` and then get them in a read
transaction by `mdbx_canary_get()`.

8. Ability to update or delete a record and get the previous value via `mdbx_replace()`.
It also allows updating a specific item of a multi-value with the same key.
8. Ability to update or delete a record and get the previous value via
`mdbx_replace()`. It also allows updating a specific item of a multi-value
with the same key.

9. Sequence generation via `mdbx_dbi_sequence()`.

@ -297,6 +290,24 @@ to avoid hard-to-debug errors.
> - double-free;
> - memory corruption and segfaults.


32. On **Mac OS X** the `fcntl(F_FULLFSYNC)` syscall is used _by
default_ to synchronize data with the disk, as this is [the only way to
guarantee data
durability](https://developer.apple.com/library/archive/documentation/System/Conceptual/ManPages_iPhoneOS/man2/fsync.2.html)
in case of power failure. Unfortunately, in scenarios with high write
intensity, the use of `F_FULLFSYNC` significantly degrades performance
compared to LMDB, where the `fsync()` syscall is used. Therefore,
_libmdbx_ allows you to override this behavior by defining the
`MDBX_OSX_SPEED_INSTEADOF_DURABILITY=1` option when building the library.

33. On **Windows** the `LockFileEx()` syscall is used for locking, since
it allows placing the database on network drives, and provides protection
against incompetent user actions (aka
[poka-yoke](https://en.wikipedia.org/wiki/Poka-yoke)). Therefore
_libmdbx_ may lag slightly behind LMDB in performance tests, where
named mutexes are used.
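
The `F_FULLFSYNC` trade-off from item 32 can be sketched as follows. This is a minimal illustration and not libmdbx's actual code: `durable_sync()` is a hypothetical helper, and the sketch assumes the build option is visible to the code as a preprocessor macro of the same name. On systems where `F_FULLFSYNC` is not defined it simply falls back to `fsync()`.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Assumption: the README's build option is modelled as a macro,
 * defaulting to 0 (durability preferred over speed). */
#ifndef MDBX_OSX_SPEED_INSTEADOF_DURABILITY
#define MDBX_OSX_SPEED_INSTEADOF_DURABILITY 0
#endif

/* Hypothetical helper, not part of the libmdbx API: flush `fd` to
 * storage. Returns 0 on success, -1 on error (errno is set). */
static int durable_sync(int fd) {
  assert(fd >= 0);
#if defined(F_FULLFSYNC) && !MDBX_OSX_SPEED_INSTEADOF_DURABILITY
  /* Mac OS X: also ask the drive to flush its own write cache. */
  return fcntl(fd, F_FULLFSYNC) == -1 ? -1 : 0;
#else
  /* Other POSIX systems, or speed explicitly preferred on Mac OS X. */
  return fsync(fd);
#endif
}
```

Compiling this sketch with `-DMDBX_OSX_SPEED_INSTEADOF_DURABILITY=1` would make it use `fsync()` on Mac OS X as well, trading durability on power failure for write throughput.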

--------------------------------------------------------------------------------

## Gotchas
@ -360,7 +371,8 @@ to simplify this as follows:
exhaust the free DB space.

* If the available space is exhausted, any attempt to update the data
will cause a "MAP_FULL" error until a long read transaction is completed.
will cause a "MAP_FULL" error until a long read transaction is
completed.

* A good example of long readers is a hot backup or debugging of
a client application while retaining an active read transaction.
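
The effect of a long reader on free space can be illustrated with a simplified model. The sketch below is an illustration under stated assumptions rather than libmdbx's garbage collector: `retired_batch`, `oldest_reader()` and `reclaimable_pages()` are hypothetical names, and the rule that pages become reusable only when they were retired by a transaction older than the oldest live reader snapshot is a deliberately conservative simplification.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified model, not libmdbx code: a batch of pages tagged with the
 * id of the write transaction that retired them. */
struct retired_batch {
  uint64_t txn_id;
  size_t pages;
};

/* Oldest snapshot still pinned by a reader, or `next_txn_id` when
 * there are no readers at all. */
static uint64_t oldest_reader(const uint64_t *readers, size_t n,
                              uint64_t next_txn_id) {
  assert(readers != NULL || n == 0);
  uint64_t oldest = next_txn_id;
  for (size_t i = 0; i < n; i++)
    if (readers[i] < oldest)
      oldest = readers[i];
  return oldest;
}

/* Pages a writer may reuse: only batches retired before `oldest`. */
static size_t reclaimable_pages(const struct retired_batch *batches,
                                size_t n, uint64_t oldest) {
  assert(batches != NULL || n == 0);
  size_t total = 0;
  for (size_t i = 0; i < n; i++)
    if (batches[i].txn_id < oldest)
      total += batches[i].pages;
  return total;
}
```

In this model, with batches retired by transactions 5, 8 and 12 and a reader still holding snapshot 6, only the first batch can be reused. Once that reader finishes, all three batches become available again, which is why completing the long read transaction clears the "MAP_FULL" condition.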
@ -373,14 +385,13 @@ operations and the `LIFO RECLAIM` mode which addresses performance
degradation.

#### Durability in asynchronous writing mode
In `WRITEMAP+MAPSYNC` mode updated (aka dirty) pages are written
to persistent storage by the OS kernel. This means that if the
application fails, the OS kernel will finish writing all updated
data to disk and nothing will be lost.
However, in the case of hardware malfunction or OS kernel fatal error,
only some updated data can be written to disk and the database structure
is likely to be destroyed.
In such a situation, the DB is completely corrupted and can't be repaired.
In `WRITEMAP+MAPSYNC` mode updated (aka dirty) pages are written to
persistent storage by the OS kernel. This means that if the application
fails, the OS kernel will finish writing all updated data to disk and
nothing will be lost. However, in the case of hardware malfunction or OS
kernel fatal error, only some updated data can be written to disk and
the database structure is likely to be destroyed. In such a situation, the DB
is completely corrupted and can't be repaired.

_libmdbx_ addresses this by fully reimplementing the write path of data:

@ -406,17 +417,19 @@ which will cause returning the MDBX_WANNA_RECOVERY error.

For data integrity, the pages which form a database snapshot with a steady
commit point must not be updated until the next steady commit point.
Therefore the last steady commit point creates an effect analogous to a "long-time read".
The only difference is that now, in case of space exhaustion, the problem
will be immediately addressed by writing changes to disk and forming
the new steady commit point.
Therefore the last steady commit point creates an effect analogous to a
"long-time read". The only difference is that now, in case of space
exhaustion, the problem will be immediately addressed by writing changes
to disk and forming the new steady commit point.

So in async-write mode _libmdbx_ will always use new pages until the
free DB space is exhausted or `mdbx_env_sync()` is invoked,
and the total write traffic to the disk will be the same as in sync-write mode.
and the total write traffic to the disk will be the same as in
sync-write mode.

Currently libmdbx gives a choice between a safe async-write mode (default) and
`UTTERLY_NOSYNC` mode which may lead to DB corruption after a system crash, i.e. like LMDB.
Currently libmdbx gives a choice between a safe async-write mode
(default) and `UTTERLY_NOSYNC` mode which may lead to DB corruption
after a system crash, i.e. like LMDB.

The next version of _libmdbx_ will automatically create steady commit
points in async-write mode upon completion of data transfer to the disk.
