Compare commits

21 Commits

Author SHA1 Message Date
Leonid Yuriev
700f3514b3 mdbx: bump version to 0.6.x
Change-Id: I925ab0aaefb1a8f9860925c2e8e7c81015428b2e
2020-01-21 00:17:55 +03:00
Leonid Yuriev
2d334185cb mdbx-tools: rework/fix mdbx_load for custom comparators.
Change-Id: I9bc15fb878d1586839768f97567806208bfcc5b8
2020-01-21 00:17:55 +03:00
Leonid Yuriev
2c1d3a3fda mdbx: refine dbi_open_ex().
Change-Id: I32bc1c6609e14ba90b2f4eaf9b8b11ea06f2eb8b
2020-01-21 00:17:55 +03:00
Leonid Yuriev
7d880a37dd mdbx: refine internal sort.
Change-Id: If07d9f6b7a7976e5e048eb1b8b7e0b65c4ed3fdd
2020-01-21 00:17:55 +03:00
Leonid Yuriev
d12b546a7d mdbx: fix MDBX_APPEND check inside cursor_put().
Change-Id: If21dedbd72b3a038252b9dc10c5c6543328362e7
2020-01-21 00:17:55 +03:00
Leonid Yuriev
6184024a80 mdbx: more __has_builtin().
Change-Id: Ie23e170e12d96ad47bf2f25c7dde974673109b54
2020-01-21 00:17:55 +03:00
Leonid Yuriev
2bfcbe980e mdbx: refine/fix dbi_bind().
Change-Id: Ic4245c349870198f79efd537cf12d9bdf691b7ca
2020-01-21 00:17:55 +03:00
Leonid Yuriev
0710b07d7c mdbx: refine/speedup dpl_search().
Change-Id: I712e22ea69f23f61c92be976069f09a85831d086
2020-01-21 00:17:55 +03:00
Leonid Yuriev
7c894f0542 mdbx: HNY!
Change-Id: Idbd21263408f87ac2715675c9f7ccc6c44d41c9a
2020-01-21 00:17:55 +03:00
Leonid Yuriev
c05875befd mdbx: refine/speedup internal sort (10-30% faster).
- more friendly for Russian Elbrus's predicates (512-bit wide VLIW).
- more CMOV-friendly for x86 (nicely optimized by gcc-9.x and clang-8.x).
- use bitonic sort for small chunks.
- less branches in the outer loop.

Change-Id: I0510f5a0b2c39a19caa9e829a20e34dfbd160a01
2020-01-21 00:17:54 +03:00
Leonid Yuriev
20b09820c9 mdbx: minor update README.
Change-Id: I15edbc2572a57e80634347b272d354cda6cc13c4
2020-01-15 21:05:02 +03:00
Leonid Yuriev
1c4653d466 mdbx: update README (note about HyperThreading in read-scalability benchmark).
Change-Id: I03e49a9675ecf585a8e2df56cca9949dd9b5bccb
2020-01-09 19:10:35 +03:00
Leonid Yuriev
8cd7cfc65d mdbx-test: refine jitter testcase.
Change-Id: If1a3774da2b8b29249d81a54799117646820c036
2020-01-06 01:42:31 +03:00
Leonid Yuriev
995a26cf19 mdbx-windws: refine/fix handling STATUS_CONFLICTING_ADDRESSES.
Change-Id: I501acb2d5d653c74ab210907dd955d7167956af8
2020-01-06 01:23:11 +03:00
Leonid Yuriev
230e4654f1 mdbx-test: don't use MDBX_DBG_DUMP.
Change-Id: I10274a2037d0630b5ba5ea39a67a107c5615e4cd
2020-01-05 15:17:06 +03:00
Leonid Yuriev
297fe3885c mdbx: update README.
Change-Id: Ied776d508485f8cb1165a6fb98220672518b1e01
2020-01-05 00:49:16 +03:00
Leonid Yuriev
cda829b327 mdbx-tests: fix built-in help.
Change-Id: Ia4073e6394b48ceef7b032bd023d4d409efc7667
2020-01-05 00:49:16 +03:00
Leonid Yuriev
f282ae45e0 mdbx: more unlikely (minor).
Change-Id: I9052d89d4b297615af199a0e2f468cda1482297a
2020-01-05 00:49:16 +03:00
Leonid Yuriev
9de65acf3e mdbx: fix env_set_geometry() for large pagesize.
Change-Id: Ide12e705abf76184f839d1670b0ca1c1e1c64da5
2020-01-05 00:49:16 +03:00
Leonid Yuriev
1c4b80ec61 mdbx-test: output txn-size limit into test-log.
Change-Id: Ib4b7b5932df794879226e0d32c8a7e6b1d31d17f
2020-01-05 00:34:33 +03:00
Leonid Yuriev
f3a95fe26b mdbx: minor refine API description.
Change-Id: If5615ebff66fe6928d24d22e1300fdf59361527d
2020-01-05 00:34:31 +03:00
54 changed files with 1104 additions and 607 deletions

@@ -10,7 +10,7 @@
##
##
## Copyright 2019 Leonid Yuriev <leo@yuriev.ru>
## Copyright 2020 Leonid Yuriev <leo@yuriev.ru>
## and other libmdbx authors: please see AUTHORS file.
## All rights reserved.
##

@@ -1,5 +1,5 @@
##
## Copyright 2019 Leonid Yuriev <leo@yuriev.ru>
## Copyright 2020 Leonid Yuriev <leo@yuriev.ru>
## and other libmdbx authors: please see AUTHORS file.
## All rights reserved.
##

@@ -1,4 +1,4 @@
Copyright 2015-2019 Leonid Yuriev <leo@yuriev.ru>.
Copyright 2015-2020 Leonid Yuriev <leo@yuriev.ru>.
Copyright 2011-2015 Howard Chu, Symas Corp.
Copyright 2015,2016 Peter-Service R&D LLC.
All rights reserved.

496
README.md

@@ -1,41 +1,71 @@
libmdbx
======================================
=======
_libmdbx_ is an extremely fast, compact, powerful, embedded
transactional [key-value
store](https://en.wikipedia.org/wiki/Key-value_database)
database, with permissive [OpenLDAP Public License](LICENSE).
_libmdbx_ has a specific set of properties and capabilities,
focused on creating unique lightweight solutions with
extraordinary performance.
_libmdbx_ is an extremely fast, compact, powerful, embedded,
transactional [key-value store](https://en.wikipedia.org/wiki/Key-value_database)
database, with [permissive license](LICENSE).
_MDBX_ has a specific set of properties and capabilities,
focused on creating unique lightweight solutions with extraordinary performance.
The next version is under active non-public development and will be
1. Allows a **swarm of multi-threaded processes to [ACID](https://en.wikipedia.org/wiki/ACID)ly read and update** several key-value [maps](https://en.wikipedia.org/wiki/Associative_array) and [multimaps](https://en.wikipedia.org/wiki/Multimap) in a locally-shared database.
2. Provides **extraordinary performance** and minimal overhead through [Memory-Mapping](https://en.wikipedia.org/wiki/Memory-mapped_file) and `O(log N)` operation cost by virtue of the [B+ tree](https://en.wikipedia.org/wiki/B%2B_tree).
3. Requires **no maintenance and no crash recovery** since it doesn't use a [WAL](https://en.wikipedia.org/wiki/Write-ahead_logging), but that might be a caveat for write-intensive workloads.
4. **Compact and friendly for full embedding**. Only 25KLOC of `C11`, 64K of x86 binary code,
no internal threads or processes, but implements a simplified variant of the
[Berkeley DB](https://en.wikipedia.org/wiki/Berkeley_DB) and
[dbm](https://en.wikipedia.org/wiki/DBM_(computing)) API.
5. Enforces [serializability](https://en.wikipedia.org/wiki/Serializability) for
writers with just a single
[mutex](https://en.wikipedia.org/wiki/Mutual_exclusion) and affords
[wait-free](https://en.wikipedia.org/wiki/Non-blocking_algorithm#Wait-freedom)
operation for parallel readers without atomic/interlocked operations, while
**writing and reading transactions do not block each other**.
6. **Guarantees data integrity** after a crash unless this was explicitly
neglected in favour of write performance.
7. Supports Linux, Windows, MacOS, FreeBSD, DragonFly, Solaris,
OpenSolaris, OpenIndiana, NetBSD, OpenBSD and other systems compliant with
**POSIX.1-2008**.
Historically, _MDBX_ is a deeply revised and extended descendant of the amazing
[Lightning Memory-Mapped Database](https://en.wikipedia.org/wiki/Lightning_Memory-Mapped_Database).
_MDBX_ inherits all benefits from _LMDB_, but resolves some issues and adds a set of improvements.
The next version is under active non-public development from scratch and will be
released as **_MithrilDB_** and `libmithrildb` for libraries & packages.
The admittedly mythical [Mithril](https://en.wikipedia.org/wiki/Mithril)
resembles silver but is stronger and lighter than steel. Therefore
_MithrilDB_ is a rightly relevant name.
> _MithrilDB_ will be radically different from _libmdbx_ by the new
> database format and API based on C++17, as well as the [Apache 2.0
> License](https://www.apache.org/licenses/LICENSE-2.0). The goal of this
> revolution is to provide a clearer and robust API, add more features and
> new valuable properties of database.
*The Future will (be) [Positive](https://www.ptsecurity.com). Всё будет хорошо.*
> _MithrilDB_ will be radically different from _libmdbx_ by the new
> database format and API based on C++17, as well as the [Apache 2.0
> License](https://www.apache.org/licenses/LICENSE-2.0). The goal of this
> revolution is to provide a clearer and robust API, add more features and
> new valuable properties of database.
[![Build Status](https://travis-ci.org/leo-yuriev/libmdbx.svg?branch=master)](https://travis-ci.org/leo-yuriev/libmdbx)
[![Build status](https://ci.appveyor.com/api/projects/status/ue94mlopn50dqiqg/branch/master?svg=true)](https://ci.appveyor.com/project/leo-yuriev/libmdbx/branch/master)
[![Coverity Scan Status](https://scan.coverity.com/projects/12915/badge.svg)](https://scan.coverity.com/projects/reopen-libmdbx)
*The Future will (be) [Positive](https://www.ptsecurity.com). Всё будет хорошо.*
-----
## Table of Contents
- [Overview](#overview)
- [Features](#features)
- [Limitations](#limitations)
- [Caveats & Gotchas](#caveats--gotchas)
- [Comparison with other databases](#comparison-with-other-databases)
- [Improvements beyond LMDB](#improvements-beyond-lmdb)
- [History & Acknowledgments](#history)
- [Description](#description)
- [Key features](#key-features)
- [Improvements over LMDB](#improvements-over-lmdb)
- [Gotchas](#gotchas)
- [Usage](#usage)
- [Building](#building)
- [API description](#api-description)
- [Bindings](#bindings)
- [Performance comparison](#performance-comparison)
- [Integral performance](#integral-performance)
@@ -45,51 +75,195 @@ _MithrilDB_ is rightly relevant name.
- [Async-write mode](#async-write-mode)
- [Cost comparison](#cost-comparison)
-----
# Overview
## Overview
## Features
_libmdbx_ is revised and extended descendant of amazing [Lightning
Memory-Mapped
Database](https://en.wikipedia.org/wiki/Lightning_Memory-Mapped_Database).
_libmdbx_ inherits all features and characteristics from
[LMDB](https://en.wikipedia.org/wiki/Lightning_Memory-Mapped_Database),
but resolves some issues and adds several features.
- Key-value data model, keys are always sorted.
- _libmdbx_ guarantee data integrity after crash unless this was explicitly
neglected in favour of write performance.
- Fully [ACID](https://en.wikipedia.org/wiki/ACID)-compliant, through to
[MVCC](https://en.wikipedia.org/wiki/Multiversion_concurrency_control)
and [CoW](https://en.wikipedia.org/wiki/Copy-on-write).
- _libmdbx_ allows multiple processes to read and update several key-value
tables concurrently, while being
[ACID](https://en.wikipedia.org/wiki/ACID)-compliant, with minimal
overhead and Olog(N) operation cost.
- Multiple key-value sub-databases within a single datafile.
- _libmdbx_ enforce
[serializability](https://en.wikipedia.org/wiki/Serializability) for
writers by single
[mutex](https://en.wikipedia.org/wiki/Mutual_exclusion) and affords
[wait-free](https://en.wikipedia.org/wiki/Non-blocking_algorithm#Wait-freedom)
for parallel readers without atomic/interlocked operations, while
writing and reading transactions do not block each other.
- Range lookups, including range query estimation.
- _libmdbx_ uses [B+Trees](https://en.wikipedia.org/wiki/B%2B_tree) and
[Memory-Mapping](https://en.wikipedia.org/wiki/Memory-mapped_file),
doesn't use [WAL](https://en.wikipedia.org/wiki/Write-ahead_logging)
which might be a caveat for some workloads.
- Efficient support for short fixed length keys, including native 32/64-bit integers.
- _libmdbx_ implements a simplified variant of the [Berkeley
DB](https://en.wikipedia.org/wiki/Berkeley_DB) and/or
[dbm](https://en.wikipedia.org/wiki/DBM_(computing)) API.
- Ultra-efficient support for [multimaps](https://en.wikipedia.org/wiki/Multimap). Multi-values sorted, searchable and iterable. Keys stored without duplication.
- _libmdbx_ supports Linux, Windows, MacOS, FreeBSD, DragonFly, Solaris,
OpenSolaris, OpenIndiana, NetBSD, OpenBSD and other systems compliant with
POSIX.1-2008.
- Data is [memory-mapped](https://en.wikipedia.org/wiki/Memory-mapped_file) and accessible directly/zero-copy. Traversal of database records is extremely fast.
- Transactions for readers and writers; neither blocks the other.
- Writes are strongly serialized. No transaction conflicts or deadlocks.
- Readers are [non-blocking](https://en.wikipedia.org/wiki/Non-blocking_algorithm), notwithstanding [snapshot isolation](https://en.wikipedia.org/wiki/Snapshot_isolation).
- Nested write transactions.
- Reads scale linearly across CPUs.
- Continuous zero-overhead database compactification.
- Automatic on-the-fly database size adjustment.
- Customizable database page size.
- `O(log N)` cost of lookup, insert, update, and delete operations by virtue of [B+ tree characteristics](https://en.wikipedia.org/wiki/B%2B_tree#Characteristics).
- Online hot backup.
- Append operation for efficient bulk insertion of pre-sorted data.
- No [WAL](https://en.wikipedia.org/wiki/Write-ahead_logging) nor any
transaction journal. No crash recovery needed. No maintenance is required.
- No internal cache and/or memory management, all done by basic OS services.
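
The logarithmic cost claimed above can be made concrete with a rough height estimate. This is a minimal sketch, not part of libmdbx: the function name `btree_height` and the fan-out of 100 entries per page are assumptions picked purely for illustration, since the real fan-out depends on page size and key length.

```python
def btree_height(records: int, fanout: int) -> int:
    """Smallest number of tree levels whose capacity (fanout ** levels)
    covers `records`. `fanout` is a hypothetical entries-per-page figure."""
    if records < 1 or fanout < 2:
        raise ValueError("records must be >= 1 and fanout must be >= 2")
    levels, capacity = 1, fanout
    while capacity < records:
        capacity *= fanout  # every extra level multiplies capacity by the fan-out
        levels += 1
    return levels


if __name__ == "__main__":
    for n in (10**3, 10**6, 10**9):
        print(f"{n} records -> about {btree_height(n, 100)} page visits per lookup")
```

Under this assumption a thousandfold growth in record count adds only one or two levels, which is what the logarithmic cost means in practice.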
## Limitations
- **Page size**: a power of 2, maximum `65536` bytes, default `4096` bytes.
- **Key size**: minimum 0, maximum ≈¼ pagesize (`1300` bytes for default 4K pagesize, `21780` bytes for 64K pagesize).
- **Value size**: minimum 0, maximum `2146435072` (`0x7FF00000`) bytes for maps, ≈¼ pagesize for multimaps (`1348` bytes for default 4K pagesize, `21828` bytes for 64K pagesize).
- **Write transaction size**: up to `4194301` (`0x3FFFFD`) pages (16 [GiB](https://en.wikipedia.org/wiki/Gibibyte) for default 4K pagesize, 256 [GiB](https://en.wikipedia.org/wiki/Gibibyte) for 64K pagesize).
- **Database size**: up to `2147483648` pages (8 [TiB](https://en.wikipedia.org/wiki/Tebibyte) for default 4K pagesize, 128 [TiB](https://en.wikipedia.org/wiki/Tebibyte) for 64K pagesize).
- **Maximum sub-databases**: `32765`.
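
The byte figures in the list above follow from multiplying the page-count limits by the page size. The sketch below only redoes that arithmetic; the constant and function names are illustrative and are not libmdbx identifiers.

```python
MAX_DB_PAGES = 2147483648   # database size limit in pages, from the list above
MAX_TXN_PAGES = 4194301     # write transaction limit in pages (0x3FFFFD)
GIB = 1 << 30
TIB = 1 << 40


def limit_bytes(pages: int, pagesize: int) -> int:
    """Convert a limit expressed in pages into bytes for a given page size."""
    if pagesize <= 0 or pagesize > 65536 or pagesize & (pagesize - 1):
        raise ValueError("page size must be a power of 2 and at most 65536")
    return pages * pagesize


if __name__ == "__main__":
    for pagesize in (4096, 65536):
        db = limit_bytes(MAX_DB_PAGES, pagesize) / TIB
        txn = limit_bytes(MAX_TXN_PAGES, pagesize) / GIB
        print(f"pagesize {pagesize}: database up to {db:.0f} TiB, "
              f"write transaction up to {txn:.2f} GiB")
```

Note that the write transaction limit is three pages short of a round power of two, so the 16 GiB and 256 GiB figures are slight round-ups.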
## Caveats & Gotchas
1. There cannot be more than one writer at a time, i.e. no more than one write transaction at a time.
2. MDBX is based on a [B+ tree](https://en.wikipedia.org/wiki/B%2B_tree), so access to database pages is mostly random.
Thus SSDs provide a significant performance boost over spinning disks for large databases.
3. MDBX uses [shadow paging](https://en.wikipedia.org/wiki/Shadow_paging) instead of [WAL](https://en.wikipedia.org/wiki/Write-ahead_logging). Thus syncing data to disk might be a bottleneck for write-intensive workloads.
4. MDBX uses [copy-on-write](https://en.wikipedia.org/wiki/Copy-on-write) for [snapshot isolation](https://en.wikipedia.org/wiki/Snapshot_isolation) during updates, but read transactions prevent recycling of old retired/freed pages, since they are still reading them. Thus altering data during a parallel
long-lived read operation will increase the process working set, may exhaust the entire free database space,
can make the database grow quickly, and results in performance degradation.
Try to avoid long-running read transactions.
5. MDBX is extraordinarily fast and provides minimal overhead for data access,
so you should reconsider using brute-force techniques and double-check your code.
On the one hand, in the case of MDBX, a simple linear search may be more profitable than complex indexes.
On the other hand, if you do something suboptimally, you may notice the detriment only on sufficiently large data.
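
Caveat 4 can be illustrated with a toy model of copy-on-write page recycling. This is a deliberately simplified sketch and not libmdbx's allocator: the function name, the fixed number of pages rewritten per transaction, and the single reader that opens before the first write and never finishes are all assumptions.

```python
def file_pages(txns: int, pages_per_txn: int, reader_open: bool = False) -> int:
    """Total pages in the file after `txns` write transactions.

    Each transaction rewrites `pages_per_txn` pages via copy-on-write and
    retires the pages it replaced. With `reader_open=True`, a reader that
    started before the first write keeps every retired page pinned.
    """
    total = pages_per_txn   # pages holding the initial data
    reclaimable = 0         # retired pages that no reader can still see
    for _ in range(txns):
        reused = min(reclaimable, pages_per_txn)
        reclaimable -= reused
        total += pages_per_txn - reused   # grow the file to cover the shortfall
        if not reader_open:
            reclaimable += pages_per_txn  # replaced pages become recyclable
        # otherwise the replaced pages stay pinned by the reader's snapshot
    return total


if __name__ == "__main__":
    print("no reader:        ", file_pages(1000, 4))
    print("long-lived reader:", file_pages(1000, 4, reader_open=True))
```

In this model the file stays at a constant size when retired pages can be recycled, and grows with every transaction while the old snapshot is held open.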
### Comparison with other databases
For now please refer to [chapter of "BoltDB comparison with other
databases"](https://github.com/coreos/bbolt#comparison-with-other-databases)
which is also (mostly) applicable to _libmdbx_.
Improvements beyond LMDB
========================
_libmdbx_ is superior to the legendary _[LMDB](https://symas.com/lmdb/)_ in
terms of features and reliability, and not inferior in performance. In
comparison to _LMDB_, _libmdbx_ makes things "just work" perfectly and
out-of-the-box, rather than silently and catastrophically breaking down. The list
below is pruned down to the improvements most notable and obvious from
the user's point of view.
### Added Features:
1. Keys can be more than 2 times longer than in _LMDB_.
> For a DB with the default page size _libmdbx_ supports keys up to 1300 bytes
> and up to 21780 bytes for 64K page size. _LMDB_ allows key sizes up to
> 511 bytes and may silently lose data with large values.
2. Up to 20% faster than _LMDB_ in [CRUD](https://en.wikipedia.org/wiki/Create,_read,_update_and_delete) benchmarks.
> Benchmarks of the in-[tmpfs](https://en.wikipedia.org/wiki/Tmpfs) scenarios,
> which test the speed of the engine itself, showed that _libmdbx_ is 10-20% faster than _LMDB_.
> These and other results can be easily reproduced with [ioArena](https://github.com/pmwkaa/ioarena) just by `make bench-quartet`,
> including comparisons with [RocksDB](https://en.wikipedia.org/wiki/RocksDB)
> and [WiredTiger](https://en.wikipedia.org/wiki/WiredTiger).
3. Automatic on-the-fly database size adjustment, both increment and reduction.
> _libmdbx_ manages the database size according to parameters specified
> by the `mdbx_env_set_geometry()` function,
> which include the growth step and the truncation threshold.
4. Automatic continuous zero-overhead database compactification.
> During each commit _libmdbx_ merges suitable pages being freed into the unallocated area
> at the end of the file, and then truncates the unused space once enough of it has accumulated.
5. The same database format for 32- and 64-bit builds.
> _libmdbx_ database format depends only on the [endianness](https://en.wikipedia.org/wiki/Endianness) but not on the [bitness](https://en.wiktionary.org/wiki/bitness).
6. LIFO policy for Garbage Collection recycling. Thanks to the write-back disk cache, this can increase write performance by up to several times in a best-case scenario.
> LIFO means that the most recently freed pages are taken for reuse first.
> Therefore the loop of database page circulation becomes as short as possible.
> In other words, the set of pages that are (over)written in memory and on disk during a series of write transactions will be as small as possible.
> This creates ideal conditions for the efficiency of a battery-backed or flash-backed disk cache.
7. Fast estimation of range query result volume, i.e. how many items can
be found between a `KEY1` and a `KEY2`. This is a prerequisite for building
and/or optimizing query execution plans.
> _libmdbx_ performs a rough estimate based on common B-tree pages of the paths from root to corresponding keys.
8. `mdbx_chk` tool for database integrity check.
9. Automated steady sync-to-disk upon several thresholds and/or timeout via cheap polling.
10. Sequence generation and three persistent 64-bit markers.
11. Callback for lack-of-space condition of database that allows you to control and/or resolve such situations.
12. Support for opening database in the exclusive mode, including on a network share.
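
The effect of the LIFO recycling policy from item 6 can be shown with a toy simulation. It is only a sketch under stated assumptions and not libmdbx's garbage collector: the function name, the pool of 100 free pages and the 4 pages rewritten per transaction are hypothetical, and readers are ignored entirely.

```python
from collections import deque


def distinct_pages_written(policy: str, free_pages: int,
                           pages_per_txn: int, txns: int) -> int:
    """Count distinct page numbers written over a series of write transactions.

    Each transaction takes `pages_per_txn` pages from the free list
    (newest first for "lifo", oldest first for "fifo") and retires the
    pages that held the previous version of the data.
    """
    if policy not in ("lifo", "fifo"):
        raise ValueError("policy must be 'lifo' or 'fifo'")
    if free_pages < pages_per_txn:
        raise ValueError("free list is too small for one transaction")
    free = deque(range(free_pages))  # left end = freed longest ago
    live = list(range(free_pages, free_pages + pages_per_txn))
    written = set()
    for _ in range(txns):
        take = free.pop if policy == "lifo" else free.popleft
        fresh = [take() for _ in range(pages_per_txn)]
        written.update(fresh)
        free.extend(live)            # retired pages join the free list as newest
        live = fresh
    return len(written)


if __name__ == "__main__":
    for policy in ("lifo", "fifo"):
        n = distinct_pages_written(policy, free_pages=100, pages_per_txn=4, txns=1000)
        print(f"{policy}: {n} distinct pages written")
```

In this model LIFO keeps rewriting the same small set of pages, while FIFO cycles through the whole free pool. The small set is the working set a write-back cache can absorb.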
### Added Abilities:
1. Zero-length for keys and values.
2. Ability to determine whether the particular data is on a dirty page
or not, which makes it possible to avoid copy-out before updates.
3. Ability to determine whether the cursor is pointed to a key-value
pair, to the first, to the last, or not set to anything.
4. Extended information of whole-database, sub-databases, transactions, readers enumeration.
> _libmdbx_ provides a lot of information, including dirty and leftover pages
> for a write transaction, reading lag and holdover space for read transactions.
5. Extended update and delete operations.
> _libmdbx_ allows these operations to retrieve the previous value _at once_
> and to address a particular item among multi-values with the same key.
### Other fixes and specifics:
1. Fixed more than 10 significant errors, in particular: page leaks, wrong sub-database statistics, segfaults in several conditions, a suboptimal page merge strategy, updating an existing record with a change in data size (including for multimap), etc.
2. All cursors can be reused and should be closed explicitly, regardless of whether they were opened within a write or read transaction.
3. Opening database handles is free from race conditions and
pre-opening is not needed.
4. Returning `MDBX_EMULTIVAL` error in case of ambiguous update or delete.
5. Guarantee of database integrity even in asynchronous unordered write-to-disk mode.
> _libmdbx_ proposes an additional trade-off by implementing an append-like manner for updates
> in `MDBX_SAFE_NOSYNC` and `MDBX_WRITEMAP|MDBX_MAPASYNC` modes, which avoids database corruption after a system crash,
> contrary to LMDB. Nevertheless, the `MDBX_UTTERLY_NOSYNC` mode is available to match LMDB behaviour,
> and for special use-cases.
6. On **MacOS** the `fcntl(F_FULLFSYNC)` syscall is used _by
default_ to synchronize data with the disk, as this is [the only way to
guarantee data
durability](https://developer.apple.com/library/archive/documentation/System/Conceptual/ManPages_iPhoneOS/man2/fsync.2.html)
in case of power failure. Unfortunately, in scenarios with high write
intensity, the use of `F_FULLFSYNC` significantly degrades performance
compared to LMDB, where the `fsync()` syscall is used. Therefore,
_libmdbx_ allows you to override this behavior by defining the
`MDBX_OSX_SPEED_INSTEADOF_DURABILITY=1` option when building the library.
7. On **Windows** the `LockFileEx()` syscall is used for locking, since
it allows placing the database on network drives, and provides protection
against incompetent user actions (aka
[poka-yoke](https://en.wikipedia.org/wiki/Poka-yoke)). Therefore
_libmdbx_ may lag slightly behind LMDB in performance tests, where
named mutexes are used.
### History
At first the development was carried out within the
[ReOpenLDAP](https://github.com/leo-yuriev/ReOpenLDAP) project. About a
@@ -107,215 +281,6 @@ originated the MDBX in 2015.
Martin Hedenfalk <martin@bzero.se> is the author of `btree.c` code, which
was used for begin development of LMDB.
-----
Description
===========
## Key features
1. Key-value pairs are stored in ordered map(s), keys are always sorted,
range lookups are supported.
2. Data is [memory-mapped](https://en.wikipedia.org/wiki/Memory-mapped_file)
into each worker DB process, and could be accessed zero-copy from transactions.
3. Transactions are
[ACID](https://en.wikipedia.org/wiki/ACID)-compliant, through to
[MVCC](https://en.wikipedia.org/wiki/Multiversion_concurrency_control)
and [CoW](https://en.wikipedia.org/wiki/Copy-on-write). Writes are
strongly serialized and aren't blocked by reads, transactions can't
conflict with each other. Reads are guaranteed to get only commited data
([relaxing serializability](https://en.wikipedia.org/wiki/Serializability#Relaxing_serializability)).
4. Read transactions are
[non-blocking](https://en.wikipedia.org/wiki/Non-blocking_algorithm),
don't use [atomic operations](https://en.wikipedia.org/wiki/Linearizability#High-level_atomic_operations).
Readers don't block each other and aren't blocked by writers. Read
performance scales linearly with CPU core count.
> Nonetheless, "connect to DB" (starting the first read transaction in a thread) and
> "disconnect from DB" (closing DB or thread termination) requires a lock
> acquisition to register/unregister at the "readers table".
5. Keys with multiple values are stored efficiently without key
duplication, sorted by value, including integers (valuable for
secondary indexes).
6. Efficient operation on short fixed length keys,
including 32/64-bit integer types.
7. [WAF](https://en.wikipedia.org/wiki/Write_amplification) (Write
Amplification Factor) and RAF (Read Amplification Factor) are Olog(N).
8. No [WAL](https://en.wikipedia.org/wiki/Write-ahead_logging) and
transaction journal. In case of a crash no recovery needed. No need for
regular maintenance. Backups can be made on the fly on working DB
without freezing writers.
9. No additional memory management, all done by basic OS services.
## Improvements over LMDB
_libmdbx_ is superior to legendary _[LMDB](https://symas.com/lmdb/)_ in
terms of features and reliability, not inferior in performance. In
comparison to _LMDB_, _libmdbx_ make things "just work" perfectly and
out-of-the-box, not silently and catastrophically break down. The list
below is pruned down to the improvements most notable and obvious from
the user's point of view.
1. Larger limit for keys size. More than 2 larger than _LMDB_.
> For DB with default page size _libmdbx_ support keys up to 1300 bytes
> and up to 21780 bytes for 64K page size. _LMDB_ allows key size up to
> 511 bytes and may silently loses data with large values.
2. Up to 20% faster than _LMDB_ in [CRUD](https://en.wikipedia.org/wiki/Create,_read,_update_and_delete) benchmarks.
> Benchmarks of the in-[tmpfs](https://en.wikipedia.org/wiki/Tmpfs) scenarios,
> that tests the speed of engine itself, shown that _libmdbx_ 10-20% faster than _LMDB_.
> These and other results could be easily reproduced with [ioArena](https://github.com/pmwkaa/ioarena) just by `make bench-quartet`,
> including comparisons with [RockDB](https://en.wikipedia.org/wiki/RocksDB)
> and [WiredTiger](https://en.wikipedia.org/wiki/WiredTiger).
3. Automatic on-the-fly database size control by preset parameters, both
reduction and increment.
> _libmdbx_ manage the database size according to parameters specified
> by `mdbx_env_set_geometry()` function,
> ones include the growth step and the truncation threshold.
4. Automatic continuous zero-overhead database compactification.
> _libmdbx_ logically move as possible a freed pages
> at end of allocation area into unallocated space,
> and then release such space if a lot of.
5. LIFO policy for recycling a Garbage Collection items. On systems with a disk
write-back cache, this can significantly increase write performance, up to
several times in a best case scenario.
> LIFO means that for reuse pages will be taken which became unused the lastest.
> Therefore the loop of database pages circulation becomes as short as possible.
> In other words, the number of pages, that are overwritten in memory
> and on disk during a series of write transactions, will be as small as possible.
> Thus creates ideal conditions for the efficient operation of the disk write-back cache.
6. Fast estimation of range query result volume, i.e. how many items can
be found between a `KEY1` and a `KEY2`. This is prerequisite for build
and/or optimize query execution plans.
> _libmdbx_ performs a rough estimate based only on b-tree pages that
> are common for the both stacks of cursors that were set to corresponing
> keys.
7. `mdbx_chk` tool for database integrity check.
8. Guarantee of database integrity even in asynchronous unordered write-to-disk mode.
> _libmdbx_ propose additional trade-off by implementing append-like manner for updates
> in `NOSYNC` and `MAPASYNC` modes, that avoid database corruption after a system crash
> contrary to LMDB. Nevertheless, the `MDBX_UTTERLY_NOSYNC` mode available to match LMDB behaviour,
> and for a special use-cases.
9. Automated steady flush to disk upon volume of changes and/or by
timeout via cheap polling.
10. Sequence generation and three cheap persistent 64-bit markers with ACID.
11. Support for keys and values of zero length, including multi-values
(aka sorted duplicates).
12. The handler of lack-of-space condition with a callback,
that allow you to control and resolve such situations.
13. Support for opening a database in the exclusive mode, including on a network share.
14. Extended transaction info, including dirty and leftover space info
for a write transaction, reading lag and hold over space for read
transactions.
15. Extended whole-database info (aka environment) and reader enumeration.
16. Extended update or delete, _at once_ with getting previous value
and addressing the particular item from multi-value with the same key.
17. Support for explicitly updating the existing record, not insertion a new one.
18. All cursors are uniformly, can be reused and should be closed explicitly,
regardless ones were opened within write or read transaction.
19. Correct update of current record with `MDBX_CURRENT` flag when size
of key or data was changed, including sorted duplicated.
20. Opening database handles is spared from race conditions and
pre-opening is not needed.
21. Ability to determine whether the particular data is on a dirty page
or not, that allows to avoid copy-out before updates.
22. Ability to determine whether the cursor is pointed to a key-value
pair, to the first, to the last, or not set to anything.
23. Returning `MDBX_EMULTIVAL` error in case of ambiguous update or delete.
24. On **MacOS** the `fcntl(F_FULLFSYNC)` syscall is used _by
default_ to synchronize data with the disk, as this is [the only way to
guarantee data
durability](https://developer.apple.com/library/archive/documentation/System/Conceptual/ManPages_iPhoneOS/man2/fsync.2.html)
in case of power failure. Unfortunately, in scenarios with high write
intensity, the use of `F_FULLFSYNC` significant degrades performance
compared to LMDB, where the `fsync()` syscall is used. Therefore,
_libmdbx_ allows you to override this behavior by defining the
`MDBX_OSX_SPEED_INSTEADOF_DURABILITY=1` option while build the library.
25. On **Windows** the `LockFileEx()` syscall is used for locking, since
it allows place the database on network drives, and provides protection
against incompetent user actions (aka
[poka-yoke](https://en.wikipedia.org/wiki/Poka-yoke)). Therefore
_libmdbx_ may be a little lag in performance tests from LMDB where a
named mutexes are used.
## Gotchas
1. There cannot be more than one writer at a time.
> On the other hand, this allows serialize an updates and eliminate any
> possibility of conflicts, deadlocks or logical errors.
2. No [WAL](https://en.wikipedia.org/wiki/Write-ahead_logging) means
relatively big [WAF](https://en.wikipedia.org/wiki/Write_amplification)
(Write Amplification Factor). Because of this syncing data to disk might
be quite resource intensive and be main performance bottleneck during
intensive write workload.
> As compromise _libmdbx_ allows several modes of lazy and/or periodic
> syncing, including `MAPASYNC` mode, which modificate data in memory and
> asynchronously syncs data to disk, moment to sync is picked by OS.
>
> Although this should be used with care, synchronous transactions in a DB
> with transaction journal will require 2 IOPS minimum (probably 3-4 in
> practice) because of filesystem overhead, overhead depends on
> filesystem, not on record count or record size. In _libmdbx_ IOPS count
> will grow logarithmically depending on record count in DB (height of B+
> tree) and will require at least 2 IOPS per transaction too.
3. [CoW](https://en.wikipedia.org/wiki/Copy-on-write) for
[MVCC](https://en.wikipedia.org/wiki/Multiversion_concurrency_control)
is done on memory page level with
[B+trees](https://ru.wikipedia.org/wiki/B-%D0%B4%D0%B5%D1%80%D0%B5%D0%B2%D0%BE).
Therefore altering data requires to copy about Olog(N) memory pages,
which uses [memory bandwidth](https://en.wikipedia.org/wiki/Memory_bandwidth) and is main
performance bottleneck in `MDBX_MAPASYNC` mode.
> This is unavoidable, but isn't that bad. Syncing data to disk requires
> much more similar operations which will be done by OS, therefore this is
> noticeable only if data sync to persistent storage is fully disabled.
> _libmdbx_ allows to safely save data to persistent storage with minimal
> performance overhead. If there is no need to save data to persistent
> storage then it's much more preferable to use `std::map`.
4. Massive altering of data during a parallel long read operation will
increase the process work set, may exhaust entire free database space and
result in subsequent write performance degradation.
> _libmdbx_ mostly solve this issue by lack-of-space callback and `MDBX_LIFORECLAIM` mode.
> See [`mdbx.h`](mdbx.h) with API description for details.
> The "next" version of libmdbx (MithrilDB) will completely solve this.
5. There are no built-in checksums or digests to verify database integrity.
> The "next" version of _libmdbx_ (MithrilDB) will solve this issue employing [Merkle Tree](https://en.wikipedia.org/wiki/Merkle_tree).
--------------------------------------------------------------------------------
Usage
@@ -435,6 +400,9 @@ will need to install the current (not outdated) version of
recommend that you install [Homebrew](https://brew.sh/) and then execute
`brew install bash`.
## API description
For more information and API description see the [mdbx.h](mdbx.h) header.
## Bindings
| Runtime | GitHub | Author |
@@ -450,14 +418,14 @@ Performance comparison
All benchmarks were done in 2015 by [IOArena](https://github.com/pmwkaa/ioarena)
and multiple [scripts](https://github.com/pmwkaa/ioarena/tree/HL%2B%2B2015)
runs on Lenovo Carbon-2 laptop, i7-4600U 2.1 GHz, 8 Gb RAM,
runs on Lenovo Carbon-2 laptop, i7-4600U 2.1 GHz (2 physical cores, 4 HyperThreading cores), 8 Gb RAM,
SSD SAMSUNG MZNTD512HAGL-000L1 (DXT23L0Q) 512 Gb.
## Integral performance
Shown here is the sum of performance metrics in 3 benchmarks:
- Read/Search on 4 CPU cores machine;
- Read/Search on machine with 4 logical CPU in HyperThreading mode (i.e. actually 2 physical CPU cores);
- Transactions with [CRUD](https://en.wikipedia.org/wiki/CRUD)
operations in sync-write mode (fdatasync is called after each
@@ -565,7 +533,7 @@ and after full run the database contains 10,000 small key-value records.
Summary of used resources during lazy-write mode benchmarks:
- Read and write IOPS;
- Read and write IOPs;
- Sum of user CPU time and sys CPU time;
@@ -574,7 +542,7 @@ Summary of used resources during lazy-write mode benchmarks:
compactification, etc).
_ForestDB_ is excluded because the benchmark showed its resource
consumption for each resource (CPU, IOPS) much higher than other engines
consumption for each resource (CPU, IOPs) much higher than other engines
which prevents a meaningful comparison with them.
All benchmark data is gathered by


@@ -1,4 +1,4 @@
version: 0.5.0.{build}
version: 0.6.0.{build}
environment:
matrix:


@@ -1,4 +1,4 @@
## Copyright (c) 2012-2019 Leonid Yuriev <leo@yuriev.ru>.
## Copyright (c) 2012-2020 Leonid Yuriev <leo@yuriev.ru>.
##
## Licensed under the Apache License, Version 2.0 (the "License");
## you may not use this file except in compliance with the License.


@@ -1,4 +1,4 @@
## Copyright (c) 2012-2019 Leonid Yuriev <leo@yuriev.ru>.
## Copyright (c) 2012-2020 Leonid Yuriev <leo@yuriev.ru>.
##
## Licensed under the Apache License, Version 2.0 (the "License");
## you may not use this file except in compliance with the License.


@@ -1,4 +1,4 @@
## Copyright (c) 2012-2019 Leonid Yuriev <leo@yuriev.ru>.
## Copyright (c) 2012-2020 Leonid Yuriev <leo@yuriev.ru>.
##
## Licensed under the Apache License, Version 2.0 (the "License");
## you may not use this file except in compliance with the License.


@@ -4,7 +4,7 @@
*/
/*
* Copyright 2015-2019 Leonid Yuriev <leo@yuriev.ru>.
* Copyright 2015-2020 Leonid Yuriev <leo@yuriev.ru>.
* Copyright 2017 Ilya Shipitsin <chipitsine@gmail.com>.
* Copyright 2012-2015 Howard Chu, Symas Corp.
* All rights reserved.


@@ -4,7 +4,7 @@
*/
/*
* Copyright 2015-2019 Leonid Yuriev <leo@yuriev.ru>.
* Copyright 2015-2020 Leonid Yuriev <leo@yuriev.ru>.
* Copyright 2012-2015 Howard Chu, Symas Corp.
* Copyright 2015,2016 Peter-Service R&D LLC.
* All rights reserved.

mdbx.h

@@ -55,10 +55,10 @@
* transaction logs or append-only data writes, MDBX requires no maintenance
* during operation. Both write-ahead loggers and append-only databases require
* periodic checkpointing and/or compaction of their log or database files
* otherwise they grow without bound. MDBX tracks free pages within the database
* and re-uses them for new write operations, so the database size does not grow
* without bound in normal use. It is worth noting that the "next" version
* libmdbx (MithrilDB) will solve this problem.
* otherwise they grow without bound. MDBX tracks retired/freed pages within the
* database and re-uses them for new write operations, so the database size does
* not grow without bound in normal use. It is worth noting that the "next"
* version libmdbx (MithrilDB) will solve this problem.
*
* The memory map can be used as a read-only or read-write map. It is read-only
* by default as this provides total immunity to corruption. Using read-write
@@ -403,17 +403,16 @@
* the lock was restored - we have to wait until such a process releases the
* database, and so on.
*
* - Avoid long-lived transactions, especially in the scenarios with a high
* rate of write transactions. Read transactions prevent reuse of pages
* freed by newer write transactions, thus the database can grow quickly.
* Write transactions prevent other write transactions, since writes are
* serialized.
* - Avoid long-lived read transactions, especially in the scenarios with a
* high rate of write transactions. Long-lived read transactions prevents
* recycling pages retired/freed by newer write transactions, thus the
* database can grow quickly.
*
* Understanding the problem of long-lived read transactions requires some
* explanation, but can be difficult for quick perception. So it is
* reasonable to simplify this as follows:
* 1. Garbage collection problem exists in all databases one way or
* another, e.g. VACUUM in PostgreSQL. But in _libmdbx_ it's even more
* another, e.g. VACUUM in PostgreSQL. But in MDBX it's even more
* discernible because of high transaction rate and intentional
* internals simplification in favor of performance.
*
@@ -461,7 +460,7 @@
*
**** LICENSE AND COPYRUSTING **************************************************
*
* Copyright 2015-2019 Leonid Yuriev <leo@yuriev.ru>
* Copyright 2015-2020 Leonid Yuriev <leo@yuriev.ru>
* and other libmdbx authors: please see AUTHORS file.
* All rights reserved.
*
@@ -639,7 +638,7 @@ typedef pthread_t mdbx_tid_t;
/*----------------------------------------------------------------------------*/
#define MDBX_VERSION_MAJOR 0
#define MDBX_VERSION_MINOR 5
#define MDBX_VERSION_MINOR 6
#ifndef LIBMDBX_API
#if defined(LIBMDBX_EXPORTS)


@@ -1,5 +1,5 @@
##
## Copyright 2019 Leonid Yuriev <leo@yuriev.ru>
## Copyright 2020 Leonid Yuriev <leo@yuriev.ru>
## and other libmdbx authors: please see AUTHORS file.
## All rights reserved.
##


@@ -1,5 +1,5 @@
/*
* Copyright 2015-2019 Leonid Yuriev <leo@yuriev.ru>
* Copyright 2015-2020 Leonid Yuriev <leo@yuriev.ru>
* and other libmdbx authors: please see AUTHORS file.
* All rights reserved.
*


@@ -1,5 +1,5 @@
/*
* Copyright 2015-2019 Leonid Yuriev <leo@yuriev.ru>
* Copyright 2015-2020 Leonid Yuriev <leo@yuriev.ru>
* and other libmdbx authors: please see AUTHORS file.
* All rights reserved.
*
@@ -1465,16 +1465,535 @@ static int lcklist_detach_locked(MDBX_env *env) {
}
/*------------------------------------------------------------------------------
* LY: State of the art quicksort-based sorting, with internal stack and
* shell-insertion-sort for small chunks (less than half of SORT_THRESHOLD).
*/
* LY: State of the art quicksort-based sorting, with internal stack
* and bitonic-sort for small chunks. */
/* LY: A large threshold gives some boost due to less overhead in the inner
* qsort loops, but also a penalty in the case of reverse-sorted data.
* So, 42 is magical but reasonable:
* - 0-3% faster than std::sort (from GNU C++ STL 2018) in most cases.
* - slower by a few ticks in a few cases for sequences shorter than 21. */
#define SORT_THRESHOLD 42
#define SORT_CMP_SWAP(TYPE, CMP, a, b) \
do { \
const TYPE swap_tmp = (a); \
const bool swap_cmp = CMP(swap_tmp, b); \
(a) = swap_cmp ? swap_tmp : b; \
(b) = swap_cmp ? b : swap_tmp; \
} while (0)
#define SORT_BITONIC_2(TYPE, CMP, begin) \
do { \
SORT_CMP_SWAP(TYPE, CMP, begin[0], begin[1]); \
} while (0)
#define SORT_BITONIC_3(TYPE, CMP, begin) \
do { \
SORT_CMP_SWAP(TYPE, CMP, begin[1], begin[2]); \
SORT_CMP_SWAP(TYPE, CMP, begin[0], begin[2]); \
SORT_CMP_SWAP(TYPE, CMP, begin[0], begin[1]); \
} while (0)
#define SORT_BITONIC_4(TYPE, CMP, begin) \
do { \
SORT_CMP_SWAP(TYPE, CMP, begin[0], begin[1]); \
SORT_CMP_SWAP(TYPE, CMP, begin[2], begin[3]); \
SORT_CMP_SWAP(TYPE, CMP, begin[0], begin[2]); \
SORT_CMP_SWAP(TYPE, CMP, begin[1], begin[3]); \
SORT_CMP_SWAP(TYPE, CMP, begin[1], begin[2]); \
} while (0)
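
Reviewer's illustration (not part of the commit): the compare-and-swap macro and the fixed comparator sequences above can be tried in isolation. The sketch below copies the shape of `SORT_CMP_SWAP` and the five comparators of `SORT_BITONIC_4`; the names `DEMO_LESS`, `DEMO_CMP_SWAP`, `demo_sort4` and `demo_is_sorted` are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical comparator for this sketch: strict "less than". */
#define DEMO_LESS(a, b) ((a) < (b))

/* Same shape as SORT_CMP_SWAP: one comparison result selects both outputs,
 * which leaves the compiler free to use conditional moves, not branches. */
#define DEMO_CMP_SWAP(TYPE, CMP, a, b)                                         \
  do {                                                                         \
    const TYPE swap_tmp = (a);                                                 \
    const bool swap_cmp = CMP(swap_tmp, b);                                    \
    (a) = swap_cmp ? swap_tmp : (b);                                           \
    (b) = swap_cmp ? (b) : swap_tmp;                                           \
  } while (0)

/* Sorts exactly four ints with the comparator order of SORT_BITONIC_4. */
static void demo_sort4(int *v) {
  DEMO_CMP_SWAP(int, DEMO_LESS, v[0], v[1]);
  DEMO_CMP_SWAP(int, DEMO_LESS, v[2], v[3]);
  DEMO_CMP_SWAP(int, DEMO_LESS, v[0], v[2]);
  DEMO_CMP_SWAP(int, DEMO_LESS, v[1], v[3]);
  DEMO_CMP_SWAP(int, DEMO_LESS, v[1], v[2]);
}

/* Returns true when v[0..n) is in non-decreasing order. */
static bool demo_is_sorted(const int *v, size_t n) {
  for (size_t i = 1; i < n; ++i)
    if (v[i] < v[i - 1])
      return false;
  return true;
}
```

Because the sequence of comparators does not depend on the data, checking all 16 inputs made of zeros and ones is enough to validate such a network (the zero-one principle).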
#define SORT_BITONIC_5(TYPE, CMP, begin) \
do { \
SORT_CMP_SWAP(TYPE, CMP, begin[0], begin[1]); \
SORT_CMP_SWAP(TYPE, CMP, begin[3], begin[4]); \
SORT_CMP_SWAP(TYPE, CMP, begin[2], begin[4]); \
SORT_CMP_SWAP(TYPE, CMP, begin[2], begin[3]); \
SORT_CMP_SWAP(TYPE, CMP, begin[1], begin[4]); \
SORT_CMP_SWAP(TYPE, CMP, begin[0], begin[3]); \
SORT_CMP_SWAP(TYPE, CMP, begin[0], begin[2]); \
SORT_CMP_SWAP(TYPE, CMP, begin[1], begin[3]); \
SORT_CMP_SWAP(TYPE, CMP, begin[1], begin[2]); \
} while (0)
#define SORT_BITONIC_6(TYPE, CMP, begin) \
do { \
SORT_CMP_SWAP(TYPE, CMP, begin[1], begin[2]); \
SORT_CMP_SWAP(TYPE, CMP, begin[4], begin[5]); \
SORT_CMP_SWAP(TYPE, CMP, begin[0], begin[2]); \
SORT_CMP_SWAP(TYPE, CMP, begin[3], begin[5]); \
SORT_CMP_SWAP(TYPE, CMP, begin[0], begin[1]); \
SORT_CMP_SWAP(TYPE, CMP, begin[3], begin[4]); \
SORT_CMP_SWAP(TYPE, CMP, begin[2], begin[5]); \
SORT_CMP_SWAP(TYPE, CMP, begin[0], begin[3]); \
SORT_CMP_SWAP(TYPE, CMP, begin[1], begin[4]); \
SORT_CMP_SWAP(TYPE, CMP, begin[2], begin[4]); \
SORT_CMP_SWAP(TYPE, CMP, begin[1], begin[3]); \
SORT_CMP_SWAP(TYPE, CMP, begin[2], begin[3]); \
} while (0)
#define SORT_BITONIC_7(TYPE, CMP, begin) \
do { \
SORT_CMP_SWAP(TYPE, CMP, begin[1], begin[2]); \
SORT_CMP_SWAP(TYPE, CMP, begin[3], begin[4]); \
SORT_CMP_SWAP(TYPE, CMP, begin[5], begin[6]); \
SORT_CMP_SWAP(TYPE, CMP, begin[0], begin[2]); \
SORT_CMP_SWAP(TYPE, CMP, begin[3], begin[5]); \
SORT_CMP_SWAP(TYPE, CMP, begin[4], begin[6]); \
SORT_CMP_SWAP(TYPE, CMP, begin[0], begin[1]); \
SORT_CMP_SWAP(TYPE, CMP, begin[4], begin[5]); \
SORT_CMP_SWAP(TYPE, CMP, begin[2], begin[6]); \
SORT_CMP_SWAP(TYPE, CMP, begin[0], begin[4]); \
SORT_CMP_SWAP(TYPE, CMP, begin[1], begin[5]); \
SORT_CMP_SWAP(TYPE, CMP, begin[0], begin[3]); \
SORT_CMP_SWAP(TYPE, CMP, begin[2], begin[5]); \
SORT_CMP_SWAP(TYPE, CMP, begin[1], begin[3]); \
SORT_CMP_SWAP(TYPE, CMP, begin[2], begin[4]); \
SORT_CMP_SWAP(TYPE, CMP, begin[2], begin[3]); \
} while (0)
#define SORT_BITONIC_8(TYPE, CMP, begin) \
do { \
SORT_CMP_SWAP(TYPE, CMP, begin[0], begin[1]); \
SORT_CMP_SWAP(TYPE, CMP, begin[2], begin[3]); \
SORT_CMP_SWAP(TYPE, CMP, begin[4], begin[5]); \
SORT_CMP_SWAP(TYPE, CMP, begin[6], begin[7]); \
SORT_CMP_SWAP(TYPE, CMP, begin[0], begin[2]); \
SORT_CMP_SWAP(TYPE, CMP, begin[1], begin[3]); \
SORT_CMP_SWAP(TYPE, CMP, begin[4], begin[6]); \
SORT_CMP_SWAP(TYPE, CMP, begin[5], begin[7]); \
SORT_CMP_SWAP(TYPE, CMP, begin[1], begin[2]); \
SORT_CMP_SWAP(TYPE, CMP, begin[5], begin[6]); \
SORT_CMP_SWAP(TYPE, CMP, begin[0], begin[4]); \
SORT_CMP_SWAP(TYPE, CMP, begin[3], begin[7]); \
SORT_CMP_SWAP(TYPE, CMP, begin[1], begin[5]); \
SORT_CMP_SWAP(TYPE, CMP, begin[2], begin[6]); \
SORT_CMP_SWAP(TYPE, CMP, begin[1], begin[4]); \
SORT_CMP_SWAP(TYPE, CMP, begin[3], begin[6]); \
SORT_CMP_SWAP(TYPE, CMP, begin[2], begin[4]); \
SORT_CMP_SWAP(TYPE, CMP, begin[3], begin[5]); \
SORT_CMP_SWAP(TYPE, CMP, begin[3], begin[4]); \
} while (0)
#define SORT_BITONIC_9(TYPE, CMP, begin) \
do { \
SORT_CMP_SWAP(TYPE, CMP, begin[0], begin[1]); \
SORT_CMP_SWAP(TYPE, CMP, begin[3], begin[4]); \
SORT_CMP_SWAP(TYPE, CMP, begin[6], begin[7]); \
SORT_CMP_SWAP(TYPE, CMP, begin[1], begin[2]); \
SORT_CMP_SWAP(TYPE, CMP, begin[4], begin[5]); \
SORT_CMP_SWAP(TYPE, CMP, begin[7], begin[8]); \
SORT_CMP_SWAP(TYPE, CMP, begin[0], begin[1]); \
SORT_CMP_SWAP(TYPE, CMP, begin[3], begin[4]); \
SORT_CMP_SWAP(TYPE, CMP, begin[6], begin[7]); \
SORT_CMP_SWAP(TYPE, CMP, begin[2], begin[5]); \
SORT_CMP_SWAP(TYPE, CMP, begin[0], begin[3]); \
SORT_CMP_SWAP(TYPE, CMP, begin[1], begin[4]); \
SORT_CMP_SWAP(TYPE, CMP, begin[5], begin[8]); \
SORT_CMP_SWAP(TYPE, CMP, begin[3], begin[6]); \
SORT_CMP_SWAP(TYPE, CMP, begin[4], begin[7]); \
SORT_CMP_SWAP(TYPE, CMP, begin[2], begin[5]); \
SORT_CMP_SWAP(TYPE, CMP, begin[0], begin[3]); \
SORT_CMP_SWAP(TYPE, CMP, begin[1], begin[4]); \
SORT_CMP_SWAP(TYPE, CMP, begin[5], begin[7]); \
SORT_CMP_SWAP(TYPE, CMP, begin[2], begin[6]); \
SORT_CMP_SWAP(TYPE, CMP, begin[1], begin[3]); \
SORT_CMP_SWAP(TYPE, CMP, begin[4], begin[6]); \
SORT_CMP_SWAP(TYPE, CMP, begin[2], begin[4]); \
SORT_CMP_SWAP(TYPE, CMP, begin[5], begin[6]); \
SORT_CMP_SWAP(TYPE, CMP, begin[2], begin[3]); \
} while (0)
#define SORT_BITONIC_10(TYPE, CMP, begin) \
do { \
SORT_CMP_SWAP(TYPE, CMP, begin[4], begin[9]); \
SORT_CMP_SWAP(TYPE, CMP, begin[3], begin[8]); \
SORT_CMP_SWAP(TYPE, CMP, begin[2], begin[7]); \
SORT_CMP_SWAP(TYPE, CMP, begin[1], begin[6]); \
SORT_CMP_SWAP(TYPE, CMP, begin[0], begin[5]); \
SORT_CMP_SWAP(TYPE, CMP, begin[1], begin[4]); \
SORT_CMP_SWAP(TYPE, CMP, begin[6], begin[9]); \
SORT_CMP_SWAP(TYPE, CMP, begin[0], begin[3]); \
SORT_CMP_SWAP(TYPE, CMP, begin[5], begin[8]); \
SORT_CMP_SWAP(TYPE, CMP, begin[0], begin[2]); \
SORT_CMP_SWAP(TYPE, CMP, begin[3], begin[6]); \
SORT_CMP_SWAP(TYPE, CMP, begin[7], begin[9]); \
SORT_CMP_SWAP(TYPE, CMP, begin[0], begin[1]); \
SORT_CMP_SWAP(TYPE, CMP, begin[2], begin[4]); \
SORT_CMP_SWAP(TYPE, CMP, begin[5], begin[7]); \
SORT_CMP_SWAP(TYPE, CMP, begin[8], begin[9]); \
SORT_CMP_SWAP(TYPE, CMP, begin[1], begin[2]); \
SORT_CMP_SWAP(TYPE, CMP, begin[4], begin[6]); \
SORT_CMP_SWAP(TYPE, CMP, begin[7], begin[8]); \
SORT_CMP_SWAP(TYPE, CMP, begin[3], begin[5]); \
SORT_CMP_SWAP(TYPE, CMP, begin[2], begin[5]); \
SORT_CMP_SWAP(TYPE, CMP, begin[6], begin[8]); \
SORT_CMP_SWAP(TYPE, CMP, begin[1], begin[3]); \
SORT_CMP_SWAP(TYPE, CMP, begin[4], begin[7]); \
SORT_CMP_SWAP(TYPE, CMP, begin[2], begin[3]); \
SORT_CMP_SWAP(TYPE, CMP, begin[6], begin[7]); \
SORT_CMP_SWAP(TYPE, CMP, begin[3], begin[4]); \
SORT_CMP_SWAP(TYPE, CMP, begin[5], begin[6]); \
SORT_CMP_SWAP(TYPE, CMP, begin[4], begin[5]); \
} while (0)
#define SORT_BITONIC_11(TYPE, CMP, begin) \
do { \
SORT_CMP_SWAP(TYPE, CMP, begin[0], begin[1]); \
SORT_CMP_SWAP(TYPE, CMP, begin[2], begin[3]); \
SORT_CMP_SWAP(TYPE, CMP, begin[4], begin[5]); \
SORT_CMP_SWAP(TYPE, CMP, begin[6], begin[7]); \
SORT_CMP_SWAP(TYPE, CMP, begin[8], begin[9]); \
SORT_CMP_SWAP(TYPE, CMP, begin[1], begin[3]); \
SORT_CMP_SWAP(TYPE, CMP, begin[5], begin[7]); \
SORT_CMP_SWAP(TYPE, CMP, begin[0], begin[2]); \
SORT_CMP_SWAP(TYPE, CMP, begin[4], begin[6]); \
SORT_CMP_SWAP(TYPE, CMP, begin[8], begin[10]); \
SORT_CMP_SWAP(TYPE, CMP, begin[1], begin[2]); \
SORT_CMP_SWAP(TYPE, CMP, begin[5], begin[6]); \
SORT_CMP_SWAP(TYPE, CMP, begin[9], begin[10]); \
SORT_CMP_SWAP(TYPE, CMP, begin[0], begin[4]); \
SORT_CMP_SWAP(TYPE, CMP, begin[3], begin[7]); \
SORT_CMP_SWAP(TYPE, CMP, begin[1], begin[5]); \
SORT_CMP_SWAP(TYPE, CMP, begin[6], begin[10]); \
SORT_CMP_SWAP(TYPE, CMP, begin[4], begin[8]); \
SORT_CMP_SWAP(TYPE, CMP, begin[5], begin[9]); \
SORT_CMP_SWAP(TYPE, CMP, begin[2], begin[6]); \
SORT_CMP_SWAP(TYPE, CMP, begin[0], begin[4]); \
SORT_CMP_SWAP(TYPE, CMP, begin[3], begin[8]); \
SORT_CMP_SWAP(TYPE, CMP, begin[1], begin[5]); \
SORT_CMP_SWAP(TYPE, CMP, begin[6], begin[10]); \
SORT_CMP_SWAP(TYPE, CMP, begin[2], begin[3]); \
SORT_CMP_SWAP(TYPE, CMP, begin[8], begin[9]); \
SORT_CMP_SWAP(TYPE, CMP, begin[1], begin[4]); \
SORT_CMP_SWAP(TYPE, CMP, begin[7], begin[10]); \
SORT_CMP_SWAP(TYPE, CMP, begin[3], begin[5]); \
SORT_CMP_SWAP(TYPE, CMP, begin[6], begin[8]); \
SORT_CMP_SWAP(TYPE, CMP, begin[2], begin[4]); \
SORT_CMP_SWAP(TYPE, CMP, begin[7], begin[9]); \
SORT_CMP_SWAP(TYPE, CMP, begin[5], begin[6]); \
SORT_CMP_SWAP(TYPE, CMP, begin[3], begin[4]); \
SORT_CMP_SWAP(TYPE, CMP, begin[7], begin[8]); \
} while (0)
#define SORT_BITONIC_12(TYPE, CMP, begin) \
do { \
SORT_CMP_SWAP(TYPE, CMP, begin[0], begin[1]); \
SORT_CMP_SWAP(TYPE, CMP, begin[2], begin[3]); \
SORT_CMP_SWAP(TYPE, CMP, begin[4], begin[5]); \
SORT_CMP_SWAP(TYPE, CMP, begin[6], begin[7]); \
SORT_CMP_SWAP(TYPE, CMP, begin[8], begin[9]); \
SORT_CMP_SWAP(TYPE, CMP, begin[10], begin[11]); \
SORT_CMP_SWAP(TYPE, CMP, begin[1], begin[3]); \
SORT_CMP_SWAP(TYPE, CMP, begin[5], begin[7]); \
SORT_CMP_SWAP(TYPE, CMP, begin[9], begin[11]); \
SORT_CMP_SWAP(TYPE, CMP, begin[0], begin[2]); \
SORT_CMP_SWAP(TYPE, CMP, begin[4], begin[6]); \
SORT_CMP_SWAP(TYPE, CMP, begin[8], begin[10]); \
SORT_CMP_SWAP(TYPE, CMP, begin[1], begin[2]); \
SORT_CMP_SWAP(TYPE, CMP, begin[5], begin[6]); \
SORT_CMP_SWAP(TYPE, CMP, begin[9], begin[10]); \
SORT_CMP_SWAP(TYPE, CMP, begin[0], begin[4]); \
SORT_CMP_SWAP(TYPE, CMP, begin[7], begin[11]); \
SORT_CMP_SWAP(TYPE, CMP, begin[1], begin[5]); \
SORT_CMP_SWAP(TYPE, CMP, begin[6], begin[10]); \
SORT_CMP_SWAP(TYPE, CMP, begin[3], begin[7]); \
SORT_CMP_SWAP(TYPE, CMP, begin[4], begin[8]); \
SORT_CMP_SWAP(TYPE, CMP, begin[5], begin[9]); \
SORT_CMP_SWAP(TYPE, CMP, begin[2], begin[6]); \
SORT_CMP_SWAP(TYPE, CMP, begin[0], begin[4]); \
SORT_CMP_SWAP(TYPE, CMP, begin[7], begin[11]); \
SORT_CMP_SWAP(TYPE, CMP, begin[3], begin[8]); \
SORT_CMP_SWAP(TYPE, CMP, begin[1], begin[5]); \
SORT_CMP_SWAP(TYPE, CMP, begin[6], begin[10]); \
SORT_CMP_SWAP(TYPE, CMP, begin[2], begin[3]); \
SORT_CMP_SWAP(TYPE, CMP, begin[8], begin[9]); \
SORT_CMP_SWAP(TYPE, CMP, begin[1], begin[4]); \
SORT_CMP_SWAP(TYPE, CMP, begin[7], begin[10]); \
SORT_CMP_SWAP(TYPE, CMP, begin[3], begin[5]); \
SORT_CMP_SWAP(TYPE, CMP, begin[6], begin[8]); \
SORT_CMP_SWAP(TYPE, CMP, begin[2], begin[4]); \
SORT_CMP_SWAP(TYPE, CMP, begin[7], begin[9]); \
SORT_CMP_SWAP(TYPE, CMP, begin[5], begin[6]); \
SORT_CMP_SWAP(TYPE, CMP, begin[3], begin[4]); \
SORT_CMP_SWAP(TYPE, CMP, begin[7], begin[8]); \
} while (0)
#define SORT_BITONIC_13(TYPE, CMP, begin) \
do { \
SORT_CMP_SWAP(TYPE, CMP, begin[1], begin[7]); \
SORT_CMP_SWAP(TYPE, CMP, begin[9], begin[11]); \
SORT_CMP_SWAP(TYPE, CMP, begin[3], begin[4]); \
SORT_CMP_SWAP(TYPE, CMP, begin[5], begin[8]); \
SORT_CMP_SWAP(TYPE, CMP, begin[0], begin[12]); \
SORT_CMP_SWAP(TYPE, CMP, begin[2], begin[6]); \
SORT_CMP_SWAP(TYPE, CMP, begin[0], begin[1]); \
SORT_CMP_SWAP(TYPE, CMP, begin[2], begin[3]); \
SORT_CMP_SWAP(TYPE, CMP, begin[4], begin[6]); \
SORT_CMP_SWAP(TYPE, CMP, begin[8], begin[11]); \
SORT_CMP_SWAP(TYPE, CMP, begin[7], begin[12]); \
SORT_CMP_SWAP(TYPE, CMP, begin[5], begin[9]); \
SORT_CMP_SWAP(TYPE, CMP, begin[0], begin[2]); \
SORT_CMP_SWAP(TYPE, CMP, begin[3], begin[7]); \
SORT_CMP_SWAP(TYPE, CMP, begin[10], begin[11]); \
SORT_CMP_SWAP(TYPE, CMP, begin[1], begin[4]); \
SORT_CMP_SWAP(TYPE, CMP, begin[6], begin[12]); \
SORT_CMP_SWAP(TYPE, CMP, begin[7], begin[8]); \
SORT_CMP_SWAP(TYPE, CMP, begin[11], begin[12]); \
SORT_CMP_SWAP(TYPE, CMP, begin[4], begin[9]); \
SORT_CMP_SWAP(TYPE, CMP, begin[6], begin[10]); \
SORT_CMP_SWAP(TYPE, CMP, begin[3], begin[4]); \
SORT_CMP_SWAP(TYPE, CMP, begin[5], begin[6]); \
SORT_CMP_SWAP(TYPE, CMP, begin[8], begin[9]); \
SORT_CMP_SWAP(TYPE, CMP, begin[10], begin[11]); \
SORT_CMP_SWAP(TYPE, CMP, begin[1], begin[7]); \
SORT_CMP_SWAP(TYPE, CMP, begin[2], begin[6]); \
SORT_CMP_SWAP(TYPE, CMP, begin[9], begin[11]); \
SORT_CMP_SWAP(TYPE, CMP, begin[1], begin[3]); \
SORT_CMP_SWAP(TYPE, CMP, begin[4], begin[7]); \
SORT_CMP_SWAP(TYPE, CMP, begin[8], begin[10]); \
SORT_CMP_SWAP(TYPE, CMP, begin[0], begin[5]); \
SORT_CMP_SWAP(TYPE, CMP, begin[2], begin[5]); \
SORT_CMP_SWAP(TYPE, CMP, begin[6], begin[8]); \
SORT_CMP_SWAP(TYPE, CMP, begin[9], begin[10]); \
SORT_CMP_SWAP(TYPE, CMP, begin[1], begin[2]); \
SORT_CMP_SWAP(TYPE, CMP, begin[3], begin[5]); \
SORT_CMP_SWAP(TYPE, CMP, begin[7], begin[8]); \
SORT_CMP_SWAP(TYPE, CMP, begin[4], begin[6]); \
SORT_CMP_SWAP(TYPE, CMP, begin[2], begin[3]); \
SORT_CMP_SWAP(TYPE, CMP, begin[4], begin[5]); \
SORT_CMP_SWAP(TYPE, CMP, begin[6], begin[7]); \
SORT_CMP_SWAP(TYPE, CMP, begin[8], begin[9]); \
SORT_CMP_SWAP(TYPE, CMP, begin[3], begin[4]); \
SORT_CMP_SWAP(TYPE, CMP, begin[5], begin[6]); \
} while (0)
#define SORT_BITONIC_14(TYPE, CMP, begin) \
do { \
SORT_CMP_SWAP(TYPE, CMP, begin[0], begin[1]); \
SORT_CMP_SWAP(TYPE, CMP, begin[2], begin[3]); \
SORT_CMP_SWAP(TYPE, CMP, begin[4], begin[5]); \
SORT_CMP_SWAP(TYPE, CMP, begin[6], begin[7]); \
SORT_CMP_SWAP(TYPE, CMP, begin[8], begin[9]); \
SORT_CMP_SWAP(TYPE, CMP, begin[10], begin[11]); \
SORT_CMP_SWAP(TYPE, CMP, begin[12], begin[13]); \
SORT_CMP_SWAP(TYPE, CMP, begin[0], begin[2]); \
SORT_CMP_SWAP(TYPE, CMP, begin[4], begin[6]); \
SORT_CMP_SWAP(TYPE, CMP, begin[8], begin[10]); \
SORT_CMP_SWAP(TYPE, CMP, begin[1], begin[3]); \
SORT_CMP_SWAP(TYPE, CMP, begin[5], begin[7]); \
SORT_CMP_SWAP(TYPE, CMP, begin[9], begin[11]); \
SORT_CMP_SWAP(TYPE, CMP, begin[0], begin[4]); \
SORT_CMP_SWAP(TYPE, CMP, begin[8], begin[12]); \
SORT_CMP_SWAP(TYPE, CMP, begin[1], begin[5]); \
SORT_CMP_SWAP(TYPE, CMP, begin[9], begin[13]); \
SORT_CMP_SWAP(TYPE, CMP, begin[2], begin[6]); \
SORT_CMP_SWAP(TYPE, CMP, begin[3], begin[7]); \
SORT_CMP_SWAP(TYPE, CMP, begin[0], begin[8]); \
SORT_CMP_SWAP(TYPE, CMP, begin[1], begin[9]); \
SORT_CMP_SWAP(TYPE, CMP, begin[2], begin[10]); \
SORT_CMP_SWAP(TYPE, CMP, begin[3], begin[11]); \
SORT_CMP_SWAP(TYPE, CMP, begin[4], begin[12]); \
SORT_CMP_SWAP(TYPE, CMP, begin[5], begin[13]); \
SORT_CMP_SWAP(TYPE, CMP, begin[5], begin[10]); \
SORT_CMP_SWAP(TYPE, CMP, begin[6], begin[9]); \
SORT_CMP_SWAP(TYPE, CMP, begin[3], begin[12]); \
SORT_CMP_SWAP(TYPE, CMP, begin[7], begin[11]); \
SORT_CMP_SWAP(TYPE, CMP, begin[1], begin[2]); \
SORT_CMP_SWAP(TYPE, CMP, begin[4], begin[8]); \
SORT_CMP_SWAP(TYPE, CMP, begin[1], begin[4]); \
SORT_CMP_SWAP(TYPE, CMP, begin[7], begin[13]); \
SORT_CMP_SWAP(TYPE, CMP, begin[2], begin[8]); \
SORT_CMP_SWAP(TYPE, CMP, begin[5], begin[6]); \
SORT_CMP_SWAP(TYPE, CMP, begin[9], begin[10]); \
SORT_CMP_SWAP(TYPE, CMP, begin[2], begin[4]); \
SORT_CMP_SWAP(TYPE, CMP, begin[11], begin[13]); \
SORT_CMP_SWAP(TYPE, CMP, begin[3], begin[8]); \
SORT_CMP_SWAP(TYPE, CMP, begin[7], begin[12]); \
SORT_CMP_SWAP(TYPE, CMP, begin[6], begin[8]); \
SORT_CMP_SWAP(TYPE, CMP, begin[10], begin[12]); \
SORT_CMP_SWAP(TYPE, CMP, begin[3], begin[5]); \
SORT_CMP_SWAP(TYPE, CMP, begin[7], begin[9]); \
SORT_CMP_SWAP(TYPE, CMP, begin[3], begin[4]); \
SORT_CMP_SWAP(TYPE, CMP, begin[5], begin[6]); \
SORT_CMP_SWAP(TYPE, CMP, begin[7], begin[8]); \
SORT_CMP_SWAP(TYPE, CMP, begin[9], begin[10]); \
SORT_CMP_SWAP(TYPE, CMP, begin[11], begin[12]); \
SORT_CMP_SWAP(TYPE, CMP, begin[6], begin[7]); \
SORT_CMP_SWAP(TYPE, CMP, begin[8], begin[9]); \
} while (0)
#define SORT_BITONIC_15(TYPE, CMP, begin) \
do { \
SORT_CMP_SWAP(TYPE, CMP, begin[0], begin[1]); \
SORT_CMP_SWAP(TYPE, CMP, begin[2], begin[3]); \
SORT_CMP_SWAP(TYPE, CMP, begin[4], begin[5]); \
SORT_CMP_SWAP(TYPE, CMP, begin[6], begin[7]); \
SORT_CMP_SWAP(TYPE, CMP, begin[8], begin[9]); \
SORT_CMP_SWAP(TYPE, CMP, begin[10], begin[11]); \
SORT_CMP_SWAP(TYPE, CMP, begin[12], begin[13]); \
SORT_CMP_SWAP(TYPE, CMP, begin[0], begin[2]); \
SORT_CMP_SWAP(TYPE, CMP, begin[4], begin[6]); \
SORT_CMP_SWAP(TYPE, CMP, begin[8], begin[10]); \
SORT_CMP_SWAP(TYPE, CMP, begin[12], begin[14]); \
SORT_CMP_SWAP(TYPE, CMP, begin[1], begin[3]); \
SORT_CMP_SWAP(TYPE, CMP, begin[5], begin[7]); \
SORT_CMP_SWAP(TYPE, CMP, begin[9], begin[11]); \
SORT_CMP_SWAP(TYPE, CMP, begin[0], begin[4]); \
SORT_CMP_SWAP(TYPE, CMP, begin[8], begin[12]); \
SORT_CMP_SWAP(TYPE, CMP, begin[1], begin[5]); \
SORT_CMP_SWAP(TYPE, CMP, begin[9], begin[13]); \
SORT_CMP_SWAP(TYPE, CMP, begin[2], begin[6]); \
SORT_CMP_SWAP(TYPE, CMP, begin[10], begin[14]); \
SORT_CMP_SWAP(TYPE, CMP, begin[3], begin[7]); \
SORT_CMP_SWAP(TYPE, CMP, begin[0], begin[8]); \
SORT_CMP_SWAP(TYPE, CMP, begin[1], begin[9]); \
SORT_CMP_SWAP(TYPE, CMP, begin[2], begin[10]); \
SORT_CMP_SWAP(TYPE, CMP, begin[3], begin[11]); \
SORT_CMP_SWAP(TYPE, CMP, begin[4], begin[12]); \
SORT_CMP_SWAP(TYPE, CMP, begin[5], begin[13]); \
SORT_CMP_SWAP(TYPE, CMP, begin[6], begin[14]); \
SORT_CMP_SWAP(TYPE, CMP, begin[5], begin[10]); \
SORT_CMP_SWAP(TYPE, CMP, begin[6], begin[9]); \
SORT_CMP_SWAP(TYPE, CMP, begin[3], begin[12]); \
SORT_CMP_SWAP(TYPE, CMP, begin[13], begin[14]); \
SORT_CMP_SWAP(TYPE, CMP, begin[7], begin[11]); \
SORT_CMP_SWAP(TYPE, CMP, begin[1], begin[2]); \
SORT_CMP_SWAP(TYPE, CMP, begin[4], begin[8]); \
SORT_CMP_SWAP(TYPE, CMP, begin[1], begin[4]); \
SORT_CMP_SWAP(TYPE, CMP, begin[7], begin[13]); \
SORT_CMP_SWAP(TYPE, CMP, begin[2], begin[8]); \
SORT_CMP_SWAP(TYPE, CMP, begin[11], begin[14]); \
SORT_CMP_SWAP(TYPE, CMP, begin[5], begin[6]); \
SORT_CMP_SWAP(TYPE, CMP, begin[9], begin[10]); \
SORT_CMP_SWAP(TYPE, CMP, begin[2], begin[4]); \
SORT_CMP_SWAP(TYPE, CMP, begin[11], begin[13]); \
SORT_CMP_SWAP(TYPE, CMP, begin[3], begin[8]); \
SORT_CMP_SWAP(TYPE, CMP, begin[7], begin[12]); \
SORT_CMP_SWAP(TYPE, CMP, begin[6], begin[8]); \
SORT_CMP_SWAP(TYPE, CMP, begin[10], begin[12]); \
SORT_CMP_SWAP(TYPE, CMP, begin[3], begin[5]); \
SORT_CMP_SWAP(TYPE, CMP, begin[7], begin[9]); \
SORT_CMP_SWAP(TYPE, CMP, begin[3], begin[4]); \
SORT_CMP_SWAP(TYPE, CMP, begin[5], begin[6]); \
SORT_CMP_SWAP(TYPE, CMP, begin[7], begin[8]); \
SORT_CMP_SWAP(TYPE, CMP, begin[9], begin[10]); \
SORT_CMP_SWAP(TYPE, CMP, begin[11], begin[12]); \
SORT_CMP_SWAP(TYPE, CMP, begin[6], begin[7]); \
SORT_CMP_SWAP(TYPE, CMP, begin[8], begin[9]); \
} while (0)
#define SORT_BITONIC_16(TYPE, CMP, begin) \
do { \
SORT_CMP_SWAP(TYPE, CMP, begin[0], begin[1]); \
SORT_CMP_SWAP(TYPE, CMP, begin[2], begin[3]); \
SORT_CMP_SWAP(TYPE, CMP, begin[4], begin[5]); \
SORT_CMP_SWAP(TYPE, CMP, begin[6], begin[7]); \
SORT_CMP_SWAP(TYPE, CMP, begin[8], begin[9]); \
SORT_CMP_SWAP(TYPE, CMP, begin[10], begin[11]); \
SORT_CMP_SWAP(TYPE, CMP, begin[12], begin[13]); \
SORT_CMP_SWAP(TYPE, CMP, begin[14], begin[15]); \
SORT_CMP_SWAP(TYPE, CMP, begin[0], begin[2]); \
SORT_CMP_SWAP(TYPE, CMP, begin[4], begin[6]); \
SORT_CMP_SWAP(TYPE, CMP, begin[8], begin[10]); \
SORT_CMP_SWAP(TYPE, CMP, begin[12], begin[14]); \
SORT_CMP_SWAP(TYPE, CMP, begin[1], begin[3]); \
SORT_CMP_SWAP(TYPE, CMP, begin[5], begin[7]); \
SORT_CMP_SWAP(TYPE, CMP, begin[9], begin[11]); \
SORT_CMP_SWAP(TYPE, CMP, begin[13], begin[15]); \
SORT_CMP_SWAP(TYPE, CMP, begin[0], begin[4]); \
SORT_CMP_SWAP(TYPE, CMP, begin[8], begin[12]); \
SORT_CMP_SWAP(TYPE, CMP, begin[1], begin[5]); \
SORT_CMP_SWAP(TYPE, CMP, begin[9], begin[13]); \
SORT_CMP_SWAP(TYPE, CMP, begin[2], begin[6]); \
SORT_CMP_SWAP(TYPE, CMP, begin[10], begin[14]); \
SORT_CMP_SWAP(TYPE, CMP, begin[3], begin[7]); \
SORT_CMP_SWAP(TYPE, CMP, begin[11], begin[15]); \
SORT_CMP_SWAP(TYPE, CMP, begin[0], begin[8]); \
SORT_CMP_SWAP(TYPE, CMP, begin[1], begin[9]); \
SORT_CMP_SWAP(TYPE, CMP, begin[2], begin[10]); \
SORT_CMP_SWAP(TYPE, CMP, begin[3], begin[11]); \
SORT_CMP_SWAP(TYPE, CMP, begin[4], begin[12]); \
SORT_CMP_SWAP(TYPE, CMP, begin[5], begin[13]); \
SORT_CMP_SWAP(TYPE, CMP, begin[6], begin[14]); \
SORT_CMP_SWAP(TYPE, CMP, begin[7], begin[15]); \
SORT_CMP_SWAP(TYPE, CMP, begin[5], begin[10]); \
SORT_CMP_SWAP(TYPE, CMP, begin[6], begin[9]); \
SORT_CMP_SWAP(TYPE, CMP, begin[3], begin[12]); \
SORT_CMP_SWAP(TYPE, CMP, begin[13], begin[14]); \
SORT_CMP_SWAP(TYPE, CMP, begin[7], begin[11]); \
SORT_CMP_SWAP(TYPE, CMP, begin[1], begin[2]); \
SORT_CMP_SWAP(TYPE, CMP, begin[4], begin[8]); \
SORT_CMP_SWAP(TYPE, CMP, begin[1], begin[4]); \
SORT_CMP_SWAP(TYPE, CMP, begin[7], begin[13]); \
SORT_CMP_SWAP(TYPE, CMP, begin[2], begin[8]); \
SORT_CMP_SWAP(TYPE, CMP, begin[11], begin[14]); \
SORT_CMP_SWAP(TYPE, CMP, begin[5], begin[6]); \
SORT_CMP_SWAP(TYPE, CMP, begin[9], begin[10]); \
SORT_CMP_SWAP(TYPE, CMP, begin[2], begin[4]); \
SORT_CMP_SWAP(TYPE, CMP, begin[11], begin[13]); \
SORT_CMP_SWAP(TYPE, CMP, begin[3], begin[8]); \
SORT_CMP_SWAP(TYPE, CMP, begin[7], begin[12]); \
SORT_CMP_SWAP(TYPE, CMP, begin[6], begin[8]); \
SORT_CMP_SWAP(TYPE, CMP, begin[10], begin[12]); \
SORT_CMP_SWAP(TYPE, CMP, begin[3], begin[5]); \
SORT_CMP_SWAP(TYPE, CMP, begin[7], begin[9]); \
SORT_CMP_SWAP(TYPE, CMP, begin[3], begin[4]); \
SORT_CMP_SWAP(TYPE, CMP, begin[5], begin[6]); \
SORT_CMP_SWAP(TYPE, CMP, begin[7], begin[8]); \
SORT_CMP_SWAP(TYPE, CMP, begin[9], begin[10]); \
SORT_CMP_SWAP(TYPE, CMP, begin[11], begin[12]); \
SORT_CMP_SWAP(TYPE, CMP, begin[6], begin[7]); \
SORT_CMP_SWAP(TYPE, CMP, begin[8], begin[9]); \
} while (0)
#define SORT_INNER(TYPE, CMP, begin, end, len) \
switch (len) { \
default: \
__unreachable(); \
case 0: \
case 1: \
break; \
case 2: \
SORT_BITONIC_2(TYPE, CMP, begin); \
break; \
case 3: \
SORT_BITONIC_3(TYPE, CMP, begin); \
break; \
case 4: \
SORT_BITONIC_4(TYPE, CMP, begin); \
break; \
case 5: \
SORT_BITONIC_5(TYPE, CMP, begin); \
break; \
case 6: \
SORT_BITONIC_6(TYPE, CMP, begin); \
break; \
case 7: \
SORT_BITONIC_7(TYPE, CMP, begin); \
break; \
case 8: \
SORT_BITONIC_8(TYPE, CMP, begin); \
break; \
case 9: \
SORT_BITONIC_9(TYPE, CMP, begin); \
break; \
case 10: \
SORT_BITONIC_10(TYPE, CMP, begin); \
break; \
case 11: \
SORT_BITONIC_11(TYPE, CMP, begin); \
break; \
case 12: \
SORT_BITONIC_12(TYPE, CMP, begin); \
break; \
case 13: \
SORT_BITONIC_13(TYPE, CMP, begin); \
break; \
case 14: \
SORT_BITONIC_14(TYPE, CMP, begin); \
break; \
case 15: \
SORT_BITONIC_15(TYPE, CMP, begin); \
break; \
case 16: \
SORT_BITONIC_16(TYPE, CMP, begin); \
break; \
}
#define SORT_SWAP(TYPE, a, b) \
do { \
@@ -1483,19 +2002,6 @@ static int lcklist_detach_locked(MDBX_env *env) {
(b) = swap_tmp; \
} while (0)
#define SORT_SHELLPASS(TYPE, CMP, begin, end, gap) \
for (TYPE *i = begin + gap; i < end; ++i) { \
for (TYPE *j = i - (gap); j >= begin && CMP(*i, *j); j -= gap) { \
const TYPE tmp = *i; \
do { \
j[gap] = *j; \
j -= gap; \
} while (j >= begin && CMP(tmp, *j)); \
j[gap] = tmp; \
break; \
} \
}
#define SORT_PUSH(low, high) \
do { \
top->lo = (low); \
@@ -1510,76 +2016,70 @@ static int lcklist_detach_locked(MDBX_env *env) {
high = top->hi; \
} while (0)
#define SORT_IMPL(NAME, TYPE, CMP) \
#define SORT_IMPL(NAME, EXPECT_LOW_CARDINALITY_OR_PRESORTED, TYPE, CMP) \
\
static __inline bool NAME##_is_sorted(const TYPE *first, const TYPE *last) { \
while (++first <= last) \
if (CMP(first[0], first[-1])) \
return false; \
return true; \
} \
\
typedef struct { \
TYPE *lo, *hi; \
} NAME##_stack; \
\
static __hot void NAME(TYPE *const begin, TYPE *const end) { \
const ptrdiff_t length = end - begin; \
if (length < 2) \
return; \
NAME##_stack stack[sizeof(unsigned) * CHAR_BIT], *top = stack; \
\
if (length > SORT_THRESHOLD / 2) { \
NAME##_stack stack[sizeof(unsigned) * CHAR_BIT], *top = stack; \
TYPE *hi = end - 1; \
TYPE *lo = begin; \
while (true) { \
const ptrdiff_t len = hi - lo; \
if (len < 16) { \
SORT_INNER(TYPE, CMP, lo, hi + 1, len + 1); \
if (unlikely(top == stack)) \
break; \
SORT_POP(lo, hi); \
continue; \
} \
\
TYPE *hi = end - 1; \
TYPE *lo = begin; \
while (true) { \
TYPE *mid = lo + ((hi - lo) >> 1); \
if (CMP(*mid, *lo)) \
SORT_SWAP(TYPE, *mid, *lo); \
if (CMP(*hi, *mid)) { \
SORT_SWAP(TYPE, *hi, *mid); \
if (CMP(*mid, *lo)) \
SORT_SWAP(TYPE, *mid, *lo); \
} \
TYPE *mid = lo + (len >> 1); \
SORT_CMP_SWAP(TYPE, CMP, *lo, *mid); \
SORT_CMP_SWAP(TYPE, CMP, *mid, *hi); \
SORT_CMP_SWAP(TYPE, CMP, *lo, *mid); \
\
TYPE *right = hi - 1; \
TYPE *left = lo + 1; \
do { \
while (CMP(*mid, *right)) \
--right; \
while (CMP(*left, *mid)) \
++left; \
if (left < right) { \
SORT_SWAP(TYPE, *left, *right); \
if (mid == left) \
mid = right; \
else if (mid == right) \
mid = left; \
++left; \
--right; \
} else if (left == right) { \
++left; \
--right; \
break; \
TYPE *right = hi - 1; \
TYPE *left = lo + 1; \
while (1) { \
while (CMP(*left, *mid)) \
++left; \
while (CMP(*mid, *right)) \
--right; \
if (unlikely(left > right)) { \
if (EXPECT_LOW_CARDINALITY_OR_PRESORTED) { \
if (NAME##_is_sorted(lo, right)) \
lo = right + 1; \
if (NAME##_is_sorted(left, hi)) \
hi = left; \
} \
} while (left <= right); \
\
if (lo + SORT_THRESHOLD > right) { \
if (left + SORT_THRESHOLD > hi) { \
if (top == stack) \
break; \
else \
SORT_POP(lo, hi); \
} else \
lo = left; \
} else if (left + SORT_THRESHOLD > hi) \
hi = right; \
else if (right - lo > hi - left) { \
SORT_PUSH(lo, right); \
lo = left; \
} else { \
SORT_PUSH(left, hi); \
hi = right; \
break; \
} \
SORT_SWAP(TYPE, *left, *right); \
mid = (mid == left) ? right : (mid == right) ? left : mid; \
++left; \
--right; \
} \
\
if (right - lo > hi - left) { \
SORT_PUSH(lo, right); \
lo = left; \
} else { \
SORT_PUSH(left, hi); \
hi = right; \
} \
} \
\
SORT_SHELLPASS(TYPE, CMP, begin, end, 8); \
SORT_SHELLPASS(TYPE, CMP, begin, end, 1); \
for (TYPE *scan = begin + 1; scan < end; ++scan) \
assert(CMP(scan[-1], scan[0])); \
}
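
Reviewer's illustration (not part of the commit): the control flow of the reworked `SORT_IMPL` is hard to follow in an interleaved diff, so here is a simplified standalone sketch of the same idea for plain `int` arrays. It keeps the explicit stack, the median-of-three step done with three compare-swaps, and the "finish small chunks directly" rule. As a loudly stated simplification, small chunks are finished with insertion sort in place of the `SORT_BITONIC_*` networks, the pivot is kept by value, and the `EXPECT_LOW_CARDINALITY_OR_PRESORTED` shortcut is omitted. All `demo_*` names and `DEMO_SMALL` are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_SMALL 16 /* hypothetical cutoff, mirrors `len < 16` above */

typedef struct {
  int *lo, *hi;
} demo_range;

/* Orders *a and *b using a single comparison result. */
static void demo_cmp_swap(int *a, int *b) {
  const int x = *a, y = *b;
  const bool keep = x < y;
  *a = keep ? x : y;
  *b = keep ? y : x;
}

/* Stand-in for the sorting networks: insertion sort of [lo, hi]. */
static void demo_small_sort(int *lo, int *hi) {
  for (int *i = lo + 1; i <= hi; ++i) {
    const int v = *i;
    int *j = i;
    while (j > lo && v < j[-1]) {
      *j = j[-1];
      --j;
    }
    *j = v;
  }
}

/* Sorts [begin, end) in non-decreasing order without recursion. */
static void demo_sort(int *begin, int *end) {
  if (end - begin < 2)
    return;
  /* The smaller part is processed first, so depth stays below log2(n). */
  demo_range stack[sizeof(size_t) * CHAR_BIT], *top = stack;
  int *lo = begin, *hi = end - 1;
  for (;;) {
    const ptrdiff_t len = hi - lo;
    if (len < DEMO_SMALL) {
      demo_small_sort(lo, hi);
      if (top == stack)
        break;
      --top;
      lo = top->lo;
      hi = top->hi;
      continue;
    }
    /* Median of three: afterwards *lo <= *mid <= *hi. */
    int *mid = lo + (len >> 1);
    demo_cmp_swap(lo, mid);
    demo_cmp_swap(mid, hi);
    demo_cmp_swap(lo, mid);
    const int pivot = *mid;
    int *left = lo + 1, *right = hi - 1;
    for (;;) {
      while (*left < pivot)
        ++left;
      while (pivot < *right)
        --right;
      if (left > right)
        break;
      const int tmp = *left;
      *left = *right;
      *right = tmp;
      ++left;
      --right;
    }
    /* Push the larger part, continue with the smaller one. */
    if (right - lo > hi - left) {
      top->lo = lo;
      top->hi = right;
      ++top;
      lo = left;
    } else {
      top->lo = left;
      top->hi = hi;
      ++top;
      hi = right;
    }
  }
}
```

A call such as `demo_sort(data, data + n)` sorts `n` ints in place; the half-open `[begin, end)` convention matches how `pgno_sort` and `txnid_sort` are invoked in the hunks that follow.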
@@ -1868,7 +2368,7 @@ static void __hot mdbx_pnl_xmerge(MDBX_PNL dst, const MDBX_PNL src) {
assert(mdbx_pnl_check4assert(dst, MAX_PAGENO + 1));
}
SORT_IMPL(pgno_sort, pgno_t, MDBX_PNL_ORDERED)
SORT_IMPL(pgno_sort, false, pgno_t, MDBX_PNL_ORDERED)
static __hot void mdbx_pnl_sort(MDBX_PNL pnl) {
pgno_sort(MDBX_PNL_BEGIN(pnl), MDBX_PNL_END(pnl));
assert(mdbx_pnl_check(pnl, MAX_PAGENO + 1));
@@ -1978,7 +2478,7 @@ static __always_inline void mdbx_txl_xappend(MDBX_TXL tl, txnid_t id) {
}
#define TXNID_SORT_CMP(first, last) ((first) > (last))
SORT_IMPL(txnid_sort, txnid_t, TXNID_SORT_CMP)
SORT_IMPL(txnid_sort, false, txnid_t, TXNID_SORT_CMP)
static void mdbx_txl_sort(MDBX_TXL tl) {
txnid_sort(MDBX_PNL_BEGIN(tl), MDBX_PNL_END(tl));
}
@@ -1996,7 +2496,7 @@ static int __must_check_result mdbx_txl_append(MDBX_TXL *ptl, txnid_t id) {
/*----------------------------------------------------------------------------*/
#define DP_SORT_CMP(first, last) ((first).pgno < (last).pgno)
SORT_IMPL(dp_sort, MDBX_DP, DP_SORT_CMP)
SORT_IMPL(dp_sort, false, MDBX_DP, DP_SORT_CMP)
static __always_inline MDBX_DPL mdbx_dpl_sort(MDBX_DPL dl) {
assert(dl->length <= MDBX_DPL_TXNFULL);
assert(dl->sorted <= dl->length);
@@ -2013,61 +2513,57 @@ static __always_inline MDBX_DPL mdbx_dpl_sort(MDBX_DPL dl) {
SEARCH_IMPL(dp_bsearch, MDBX_DP, pgno_t, DP_SEARCH_CMP)
static unsigned __hot mdbx_dpl_search(MDBX_DPL dl, pgno_t pgno) {
if (dl->sorted < dl->length) {
/* unsorted tail case */
if (mdbx_audit_enabled()) {
for (const MDBX_DP *ptr = dl + dl->sorted; --ptr > dl;) {
assert(ptr[0].pgno < ptr[1].pgno);
assert(ptr[0].pgno >= NUM_METAS);
}
}
/* try linear search until the threshold */
if (dl->length - dl->sorted < SORT_THRESHOLD / 2) {
unsigned i = dl->length;
while (i - dl->sorted > 7) {
if (dl[i].pgno == pgno)
return i;
if (dl[i - 1].pgno == pgno)
return i - 1;
if (dl[i - 2].pgno == pgno)
return i - 2;
if (dl[i - 3].pgno == pgno)
return i - 3;
if (dl[i - 4].pgno == pgno)
return i - 4;
if (dl[i - 5].pgno == pgno)
return i - 5;
if (dl[i - 6].pgno == pgno)
return i - 6;
if (dl[i - 7].pgno == pgno)
return i - 7;
i -= 8;
}
while (i > dl->sorted) {
if (dl[i].pgno == pgno)
return i;
--i;
}
MDBX_DPL it = dp_bsearch(dl + 1, i, pgno);
return (unsigned)(it - dl);
}
/* sort a whole */
dl->sorted = dl->length;
dp_sort(dl + 1, dl + dl->length + 1);
}
if (mdbx_audit_enabled()) {
for (const MDBX_DP *ptr = dl + dl->length; --ptr > dl;) {
for (const MDBX_DP *ptr = dl + dl->sorted; --ptr > dl;) {
assert(ptr[0].pgno < ptr[1].pgno);
assert(ptr[0].pgno >= NUM_METAS);
}
}
MDBX_DPL it = dp_bsearch(dl + 1, dl->length, pgno);
return (unsigned)(it - dl);
switch (dl->length - dl->sorted) {
default:
/* sort a whole */
dl->sorted = dl->length;
dp_sort(dl + 1, dl + dl->length + 1);
__fallthrough; /* fall through */
case 0:
/* whole sorted cases */
if (mdbx_audit_enabled()) {
for (const MDBX_DP *ptr = dl + dl->length; --ptr > dl;) {
assert(ptr[0].pgno < ptr[1].pgno);
assert(ptr[0].pgno >= NUM_METAS);
}
}
return (unsigned)(dp_bsearch(dl + 1, dl->length, pgno) - dl);
#define LINEAR_SEARCH_CASE(N) \
case N: \
if (dl[dl->length - N + 1].pgno == pgno) \
return dl->length - N + 1; \
__fallthrough
/* try linear search until the threshold */
LINEAR_SEARCH_CASE(16); /* fall through */
LINEAR_SEARCH_CASE(15); /* fall through */
LINEAR_SEARCH_CASE(14); /* fall through */
LINEAR_SEARCH_CASE(13); /* fall through */
LINEAR_SEARCH_CASE(12); /* fall through */
LINEAR_SEARCH_CASE(11); /* fall through */
LINEAR_SEARCH_CASE(10); /* fall through */
LINEAR_SEARCH_CASE(9); /* fall through */
LINEAR_SEARCH_CASE(8); /* fall through */
LINEAR_SEARCH_CASE(7); /* fall through */
LINEAR_SEARCH_CASE(6); /* fall through */
LINEAR_SEARCH_CASE(5); /* fall through */
LINEAR_SEARCH_CASE(4); /* fall through */
LINEAR_SEARCH_CASE(3); /* fall through */
LINEAR_SEARCH_CASE(2); /* fall through */
case 1:
if (dl[dl->length].pgno == pgno)
return dl->length;
/* continue bsearch on the sorted part */
return (unsigned)(dp_bsearch(dl + 1, dl->sorted, pgno) - dl);
}
}
static __always_inline MDBX_page *mdbx_dpl_find(MDBX_DPL dl, pgno_t pgno) {
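Note on the reworked mdbx_dpl_search(): the list has a sorted prefix and a short unsorted tail. The tail is scanned linearly while it holds at most 16 entries; otherwise the whole list is sorted first, and the sorted part is then searched by bisection. A simplified sketch of that strategy follows, assuming a 0-based `int` array and exact-match lookup via the standard `bsearch` (the real dp_bsearch returns a position rather than a match). `struct list`, `list_search` and `TAIL_LIMIT` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define TAIL_LIMIT 16 /* assumed threshold, mirroring the 16 cases above */

static int cmp_int(const void *a, const void *b) {
  const int x = *(const int *)a, y = *(const int *)b;
  return (x > y) - (x < y);
}

/* Hypothetical list: items[0..sorted) is ascending,
 * items[sorted..length) is in arbitrary order. */
struct list {
  int *items;
  size_t length, sorted;
};

/* Returns the index of key, or l->length when key is absent. */
static size_t list_search(struct list *l, int key) {
  assert(l->sorted <= l->length);
  if (l->length - l->sorted > TAIL_LIMIT) {
    /* the tail is too long for a linear scan: sort everything once */
    qsort(l->items, l->length, sizeof(int), cmp_int);
    l->sorted = l->length;
  }
  /* linear scan of the (short) unsorted tail, newest entries first */
  for (size_t i = l->length; i > l->sorted; --i)
    if (l->items[i - 1] == key)
      return i - 1;
  /* binary search over the sorted prefix */
  const int *hit = bsearch(&key, l->items, l->sorted, sizeof(int), cmp_int);
  return hit ? (size_t)(hit - l->items) : l->length;
}
```

The design choice is to defer the cost of sorting until the tail grows past the threshold, so a few fresh appends do not trigger a full sort on every lookup.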
@@ -2752,8 +3248,7 @@ static __cold __maybe_unused bool mdbx_dirtylist_check(MDBX_txn *txn) {
if (unlikely(loose != txn->tw.loose_count))
return false;
if (txn->tw.dirtylist->length - txn->tw.dirtylist->sorted <
SORT_THRESHOLD / 2) {
if (txn->tw.dirtylist->length - txn->tw.dirtylist->sorted < 16) {
for (unsigned i = 1; i <= MDBX_PNL_SIZE(txn->tw.retired_pages); ++i) {
const MDBX_page *const dp =
mdbx_dpl_find(txn->tw.dirtylist, txn->tw.retired_pages[i]);
@@ -5135,9 +5630,13 @@ static int mdbx_txn_renew0(MDBX_txn *txn, unsigned flags) {
txn->mt_flags |= MDBX_SHRINK_ALLOWED;
mdbx_srwlock_AcquireShared(&env->me_remap_guard);
}
#endif
#endif /* Windows */
} else {
env->me_dxb_mmap.current = size;
#if defined(_WIN32) || defined(_WIN64)
env->me_dxb_mmap.filesize =
(env->me_dxb_mmap.filesize < size) ? size : env->me_dxb_mmap.filesize;
#endif /* Windows */
}
#if defined(MDBX_USE_VALGRIND) || defined(__SANITIZE_ADDRESS__)
mdbx_txn_valgrind(env, txn);
@@ -7772,12 +8271,10 @@ static void __cold mdbx_setup_pagesize(MDBX_env *env, const size_t pagesize) {
mdbx_ensure(env, branch_nodemax > 42 && branch_nodemax < (int)UINT16_MAX &&
branch_nodemax % 2 == 0);
env->me_branch_nodemax = (unsigned)branch_nodemax;
env->me_maxkey_nd = (uint16_t)mdbx_limits_keysize_max(env->me_psize, 0);
env->me_maxkey_ds =
(uint16_t)mdbx_limits_keysize_max(env->me_psize, MDBX_DUPSORT);
env->me_maxval_nd = (unsigned)mdbx_limits_valsize_max(env->me_psize, 0);
env->me_maxval_ds =
(unsigned)mdbx_limits_valsize_max(env->me_psize, MDBX_DUPSORT);
env->me_maxkey_nd = (uint16_t)mdbx_limits_keysize_max(pagesize, 0);
env->me_maxkey_ds = (uint16_t)mdbx_limits_keysize_max(pagesize, MDBX_DUPSORT);
env->me_maxval_nd = (unsigned)mdbx_limits_valsize_max(pagesize, 0);
env->me_maxval_ds = (unsigned)mdbx_limits_valsize_max(pagesize, MDBX_DUPSORT);
mdbx_ensure(env, env->me_maxkey_nd ==
env->me_branch_nodemax - NODESIZE - sizeof(pgno_t));
mdbx_ensure(env, env->me_maxkey_ds ==
@@ -7988,22 +8485,24 @@ mdbx_env_set_geometry(MDBX_env *env, intptr_t size_lower, intptr_t size_now,
goto bailout;
}
size_lower = roundup_powerof2(size_lower, env->me_os_psize);
size_upper = roundup_powerof2(size_upper, env->me_os_psize);
size_now = roundup_powerof2(size_now, env->me_os_psize);
const size_t unit =
(env->me_os_psize > (size_t)pagesize) ? env->me_os_psize : pagesize;
size_lower = roundup_powerof2(size_lower, unit);
size_upper = roundup_powerof2(size_upper, unit);
size_now = roundup_powerof2(size_now, unit);
/* LY: choose a size_upper value that is:
* - a multiple of the system page size
* - a multiple of the page size
* - within the MAX_MAPSIZE and MAX_PAGENO limits */
while (unlikely((size_t)size_upper > MAX_MAPSIZE ||
(uint64_t)size_upper / pagesize > MAX_PAGENO)) {
if ((size_t)size_upper < env->me_os_psize + MIN_MAPSIZE ||
(size_t)size_upper < env->me_os_psize * (MIN_PAGENO + 1)) {
if ((size_t)size_upper < unit + MIN_MAPSIZE ||
(size_t)size_upper < (size_t)pagesize * (MIN_PAGENO + 1)) {
/* paranoia in case of overflow with improbable values */
rc = MDBX_EINVAL;
goto bailout;
}
size_upper -= env->me_os_psize;
size_upper -= unit;
if ((size_t)size_upper < (size_t)size_lower)
size_lower = size_upper;
}
@@ -8025,13 +8524,13 @@ mdbx_env_set_geometry(MDBX_env *env, intptr_t size_lower, intptr_t size_now,
}
if (growth_step == 0 && shrink_threshold > 0)
growth_step = 1;
growth_step = roundup_powerof2(growth_step, env->me_os_psize);
growth_step = roundup_powerof2(growth_step, unit);
if (bytes2pgno(env, growth_step) > UINT16_MAX)
growth_step = pgno2bytes(env, UINT16_MAX);
if (shrink_threshold < 0)
shrink_threshold = growth_step + growth_step;
shrink_threshold = roundup_powerof2(shrink_threshold, env->me_os_psize);
shrink_threshold = roundup_powerof2(shrink_threshold, unit);
if (bytes2pgno(env, shrink_threshold) > UINT16_MAX)
shrink_threshold = pgno2bytes(env, UINT16_MAX);
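Note on the geometry hunks above: sizes are now rounded to `unit`, the larger of the OS page size and the database page size, instead of the OS page size alone. When both sizes are powers of two, the larger one is a multiple of the smaller, so a multiple of `unit` is aligned to both. A minimal sketch of that arithmetic, assuming power-of-two sizes; `roundup_pow2` and `geometry_unit` are hypothetical stand-ins, not the library's functions.

```c
#include <assert.h>
#include <stddef.h>

/* Round value up to a multiple of granularity.
 * The granularity must be a non-zero power of two. */
static size_t roundup_pow2(size_t value, size_t granularity) {
  assert(granularity != 0 && (granularity & (granularity - 1)) == 0);
  return (value + granularity - 1) & ~(granularity - 1);
}

/* Hypothetical helper: the rounding unit is the larger of the two page sizes. */
static size_t geometry_unit(size_t os_pagesize, size_t db_pagesize) {
  return os_pagesize > db_pagesize ? os_pagesize : db_pagesize;
}
```

For example, with a 4096-byte OS page and a 65536-byte database page the unit is 65536, and a requested size of 100000 bytes rounds up to 131072.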
@@ -10868,7 +11367,7 @@ int mdbx_cursor_put(MDBX_cursor *mc, MDBX_val *key, MDBX_val *data,
} else if ((flags & MDBX_CURRENT) == 0) {
int exact = 0;
MDBX_val d2;
if (flags & MDBX_APPEND) {
if ((flags & MDBX_APPEND) != 0 && mc->mc_db->md_entries > 0) {
MDBX_val k2;
rc = mdbx_cursor_last(mc, &k2, &d2);
if (rc == 0) {
@@ -10954,7 +11453,7 @@ int mdbx_cursor_put(MDBX_cursor *mc, MDBX_val *key, MDBX_val *data,
if (IS_LEAF2(mc->mc_pg[mc->mc_top])) {
char *ptr;
unsigned ksize = mc->mc_db->md_xsize;
if (key->iov_len != ksize)
if (unlikely(key->iov_len != ksize))
return MDBX_BAD_VALSIZE;
ptr = page_leaf2key(mc->mc_pg[mc->mc_top], mc->mc_ki[mc->mc_top], ksize);
memcpy(ptr, key->iov_base, ksize);
@@ -14880,19 +15379,21 @@ static int mdbx_dbi_bind(MDBX_txn *txn, const MDBX_dbi dbi, unsigned user_flags,
}
}
if (!txn->mt_dbxs[dbi].md_cmp || MDBX_DEBUG) {
if (!keycmp)
keycmp = mdbx_default_keycmp(user_flags);
mdbx_tassert(txn, !txn->mt_dbxs[dbi].md_cmp ||
txn->mt_dbxs[dbi].md_cmp == keycmp);
if (!keycmp)
keycmp = txn->mt_dbxs[dbi].md_cmp ? txn->mt_dbxs[dbi].md_cmp
: mdbx_default_keycmp(user_flags);
if (txn->mt_dbxs[dbi].md_cmp != keycmp) {
if (txn->mt_dbxs[dbi].md_cmp)
return MDBX_EINVAL;
txn->mt_dbxs[dbi].md_cmp = keycmp;
}
if (!txn->mt_dbxs[dbi].md_dcmp || MDBX_DEBUG) {
if (!datacmp)
datacmp = mdbx_default_datacmp(user_flags);
mdbx_tassert(txn, !txn->mt_dbxs[dbi].md_dcmp ||
txn->mt_dbxs[dbi].md_dcmp == datacmp);
if (!datacmp)
datacmp = txn->mt_dbxs[dbi].md_dcmp ? txn->mt_dbxs[dbi].md_dcmp
: mdbx_default_datacmp(user_flags);
if (txn->mt_dbxs[dbi].md_dcmp != datacmp) {
if (txn->mt_dbxs[dbi].md_dcmp)
return MDBX_EINVAL;
txn->mt_dbxs[dbi].md_dcmp = datacmp;
}
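Note on the mdbx_dbi_bind() change: a missing comparator now falls back to the one already bound (or to the default), and an attempt to rebind a different comparator is rejected with MDBX_EINVAL instead of being caught only by an assertion. The rule can be sketched in isolation as below. `bind_cmp`, `cmp_fn`, the `BIND_*` codes and the two sample comparators are hypothetical names used for illustration.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*cmp_fn)(const void *, const void *);

enum { BIND_OK = 0, BIND_EINVAL = -1 }; /* hypothetical result codes */

/* Sample comparators for ints, ascending and descending. */
static int cmp_asc(const void *a, const void *b) {
  const int x = *(const int *)a, y = *(const int *)b;
  return (x > y) - (x < y);
}
static int cmp_desc(const void *a, const void *b) { return cmp_asc(b, a); }

/* Bind `requested` into *slot:
 * - NULL means "keep what is bound, else use the fallback";
 * - a comparator may be set once, later mismatches are an error. */
static int bind_cmp(cmp_fn *slot, cmp_fn requested, cmp_fn fallback) {
  assert(slot != NULL && fallback != NULL);
  if (!requested)
    requested = *slot ? *slot : fallback;
  if (*slot != requested) {
    if (*slot)
      return BIND_EINVAL; /* already bound to something else */
    *slot = requested;
  }
  return BIND_OK;
}
```

In the diff the same pattern is applied twice, once for md_cmp and once for md_dcmp.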
@@ -14902,13 +15403,14 @@ static int mdbx_dbi_bind(MDBX_txn *txn, const MDBX_dbi dbi, unsigned user_flags,
int mdbx_dbi_open_ex(MDBX_txn *txn, const char *table_name, unsigned user_flags,
MDBX_dbi *dbi, MDBX_cmp_func *keycmp,
MDBX_cmp_func *datacmp) {
if (unlikely(!dbi || (user_flags & ~VALID_FLAGS) != 0))
return MDBX_EINVAL;
*dbi = (MDBX_dbi)-1;
int rc = check_txn(txn, MDBX_TXN_BLOCKED);
if (unlikely(rc != MDBX_SUCCESS))
return rc;
if (unlikely(!dbi || (user_flags & ~VALID_FLAGS) != 0))
return MDBX_EINVAL;
switch (user_flags &
(MDBX_INTEGERDUP | MDBX_DUPFIXED | MDBX_DUPSORT | MDBX_REVERSEDUP)) {
default:
@@ -14925,8 +15427,10 @@ int mdbx_dbi_open_ex(MDBX_txn *txn, const char *table_name, unsigned user_flags,
/* main table? */
if (!table_name) {
*dbi = MAIN_DBI;
return mdbx_dbi_bind(txn, MAIN_DBI, user_flags, keycmp, datacmp);
rc = mdbx_dbi_bind(txn, MAIN_DBI, user_flags, keycmp, datacmp);
if (likely(rc == MDBX_SUCCESS))
*dbi = MAIN_DBI;
return rc;
}
if (txn->mt_dbxs[MAIN_DBI].md_cmp == NULL) {

View File

@@ -1,5 +1,5 @@
/*
* Copyright 2015-2019 Leonid Yuriev <leo@yuriev.ru>
* Copyright 2015-2020 Leonid Yuriev <leo@yuriev.ru>
* and other libmdbx authors: please see AUTHORS file.
* All rights reserved.
*
@@ -154,7 +154,7 @@
#endif /* __fallthrough */
#ifndef __unreachable
# if __GNUC_PREREQ(4,5)
# if __GNUC_PREREQ(4,5) || __has_builtin(__builtin_unreachable)
# define __unreachable() __builtin_unreachable()
# elif defined(_MSC_VER)
# define __unreachable() __assume(0)
@@ -294,7 +294,7 @@
#endif /* __flatten */
#ifndef likely
# if (defined(__GNUC__) || defined(__clang__)) && !defined(__COVERITY__)
# if (defined(__GNUC__) || __has_builtin(__builtin_expect)) && !defined(__COVERITY__)
# define likely(cond) __builtin_expect(!!(cond), 1)
# else
# define likely(x) (x)
@@ -302,7 +302,7 @@
#endif /* likely */
#ifndef unlikely
# if (defined(__GNUC__) || defined(__clang__)) && !defined(__COVERITY__)
# if (defined(__GNUC__) || __has_builtin(__builtin_expect)) && !defined(__COVERITY__)
# define unlikely(cond) __builtin_expect(!!(cond), 0)
# else
# define unlikely(x) (x)
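Note on the __has_builtin() checks in this file: using `__has_builtin` inside `#if` requires the macro to exist on every compiler, so a fallback definition is normally provided before the first use. That fallback is not visible in these hunks, so the sketch below assumes one. `MY_LIKELY`, `MY_UNLIKELY` and `clamp_nonneg` are hypothetical names chosen to avoid clashing with the real macros.

```c
#include <assert.h>

#ifndef __has_builtin
#define __has_builtin(x) 0 /* compilers without the feature-test macro */
#endif

#if defined(__GNUC__) || __has_builtin(__builtin_expect)
#define MY_LIKELY(cond) __builtin_expect(!!(cond), 1)
#define MY_UNLIKELY(cond) __builtin_expect(!!(cond), 0)
#else
#define MY_LIKELY(cond) (!!(cond))
#define MY_UNLIKELY(cond) (!!(cond))
#endif

/* Example use: the negative case is hinted as the rare path. */
static int clamp_nonneg(int v) {
  if (MY_UNLIKELY(v < 0))
    return 0;
  return v;
}
```

The hint changes only code layout and branch prediction, never the value of the condition.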

View File

@@ -1,5 +1,5 @@
/*
* Copyright 2015-2019 Leonid Yuriev <leo@yuriev.ru>
* Copyright 2015-2020 Leonid Yuriev <leo@yuriev.ru>
* and other libmdbx authors: please see AUTHORS file.
* All rights reserved.
*

View File

@@ -1,5 +1,5 @@
/*
* Copyright 2015-2019 Leonid Yuriev <leo@yuriev.ru>
* Copyright 2015-2020 Leonid Yuriev <leo@yuriev.ru>
* and other libmdbx authors: please see AUTHORS file.
* All rights reserved.
*

View File

@@ -1,5 +1,5 @@
/*
* Copyright 2015-2019 Leonid Yuriev <leo@yuriev.ru>
* Copyright 2015-2020 Leonid Yuriev <leo@yuriev.ru>
* and other libmdbx authors: please see AUTHORS file.
* All rights reserved.
*

View File

@@ -1,7 +1,7 @@
/* https://en.wikipedia.org/wiki/Operating_system_abstraction_layer */
/*
* Copyright 2015-2019 Leonid Yuriev <leo@yuriev.ru>
* Copyright 2015-2020 Leonid Yuriev <leo@yuriev.ru>
* and other libmdbx authors: please see AUTHORS file.
* All rights reserved.
*
@@ -1417,12 +1417,12 @@ MDBX_INTERNAL_FUNC int mdbx_mresize(int flags, mdbx_mmap_t *map, size_t size,
/* growth rw-section */
SectionSize.QuadPart = size;
status = NtExtendSection(map->section, &SectionSize);
if (NT_SUCCESS(status)) {
map->current = size;
if (map->filesize < size)
map->filesize = size;
}
return ntstatus2errcode(status);
if (!NT_SUCCESS(status))
return ntstatus2errcode(status);
map->current = size;
if (map->filesize < size)
map->filesize = size;
return MDBX_SUCCESS;
}
if (limit > map->limit) {
@@ -1431,11 +1431,10 @@ MDBX_INTERNAL_FUNC int mdbx_mresize(int flags, mdbx_mmap_t *map, size_t size,
SIZE_T RegionSize = limit - map->limit;
status = NtAllocateVirtualMemory(GetCurrentProcess(), &BaseAddress, 0,
&RegionSize, MEM_RESERVE, PAGE_NOACCESS);
if (!NT_SUCCESS(status)) {
if (status == /* STATUS_INVALID_ADDRESS */ 0xC0000141)
status = /* STATUS_CONFLICTING_ADDRESSES */ 0xC0000018;
if (status == /* STATUS_CONFLICTING_ADDRESSES */ 0xC0000018)
return MDBX_RESULT_TRUE;
if (!NT_SUCCESS(status))
return ntstatus2errcode(status);
}
status = NtFreeVirtualMemory(GetCurrentProcess(), &BaseAddress, &RegionSize,
MEM_RELEASE);
@@ -1462,9 +1461,13 @@ MDBX_INTERNAL_FUNC int mdbx_mresize(int flags, mdbx_mmap_t *map, size_t size,
bailout:
map->address = NULL;
map->current = map->limit = 0;
if (ReservedAddress)
(void)NtFreeVirtualMemory(GetCurrentProcess(), &ReservedAddress,
&ReservedSize, MEM_RELEASE);
if (ReservedAddress) {
ReservedSize = 0;
status = NtFreeVirtualMemory(GetCurrentProcess(), &ReservedAddress,
&ReservedSize, MEM_RELEASE);
assert(NT_SUCCESS(status));
(void)status;
}
return err;
}
@@ -1475,8 +1478,7 @@ MDBX_INTERNAL_FUNC int mdbx_mresize(int flags, mdbx_mmap_t *map, size_t size,
&ReservedSize, MEM_RESERVE, PAGE_NOACCESS);
if (!NT_SUCCESS(status)) {
ReservedAddress = NULL;
if (status != /* STATUS_CONFLICTING_ADDRESSES */ 0xC0000018 &&
status != /* STATUS_INVALID_ADDRESS */ 0xC0000141)
if (status != /* STATUS_CONFLICTING_ADDRESSES */ 0xC0000018)
goto bailout_ntstatus /* no way to recovery */;
/* assume we can change base address if mapping size changed or prev address
@@ -1516,6 +1518,7 @@ retry_file_and_section:
if (ReservedAddress) {
/* release reserved address space */
ReservedSize = 0;
status = NtFreeVirtualMemory(GetCurrentProcess(), &ReservedAddress,
&ReservedSize, MEM_RELEASE);
ReservedAddress = NULL;
@@ -1536,8 +1539,7 @@ retry_mapview:;
(flags & MDBX_WRITEMAP) ? PAGE_READWRITE : PAGE_READONLY);
if (!NT_SUCCESS(status)) {
if ((status == /* STATUS_CONFLICTING_ADDRESSES */ 0xC0000018 ||
status == /* STATUS_INVALID_ADDRESS */ 0xC0000141) &&
if (status == /* STATUS_CONFLICTING_ADDRESSES */ 0xC0000018 &&
map->address) {
/* try remap at another base address */
map->address = NULL;

View File

@@ -1,7 +1,7 @@
/* https://en.wikipedia.org/wiki/Operating_system_abstraction_layer */
/*
* Copyright 2015-2019 Leonid Yuriev <leo@yuriev.ru>
* Copyright 2015-2020 Leonid Yuriev <leo@yuriev.ru>
* and other libmdbx authors: please see AUTHORS file.
* All rights reserved.
*

View File

@@ -1,6 +1,6 @@
.\" Copyright 2015-2019 Leonid Yuriev <leo@yuriev.ru>.
.\" Copyright 2015-2020 Leonid Yuriev <leo@yuriev.ru>.
.\" Copying restrictions apply. See COPYRIGHT/LICENSE.
.TH MDBX_CHK 1 "2019-12-05" "MDBX 0.4.x"
.TH MDBX_CHK 1 "2020-01-20" "MDBX 0.6.x"
.SH NAME
mdbx_chk \- MDBX checking tool
.SH SYNOPSIS

View File

@@ -1,8 +1,8 @@
.\" Copyright 2015-2019 Leonid Yuriev <leo@yuriev.ru>.
.\" Copyright 2015-2020 Leonid Yuriev <leo@yuriev.ru>.
.\" Copyright 2012-2015 Howard Chu, Symas Corp. All Rights Reserved.
.\" Copyright 2015,2016 Peter-Service R&D LLC <http://billing.ru/>.
.\" Copying restrictions apply. See COPYRIGHT/LICENSE.
.TH MDBX_COPY 1 "2019-12-05" "MDBX 0.4.x"
.TH MDBX_COPY 1 "2020-01-20" "MDBX 0.6.x"
.SH NAME
mdbx_copy \- MDBX environment copy tool
.SH SYNOPSIS

View File

@@ -1,8 +1,8 @@
.\" Copyright 2015-2019 Leonid Yuriev <leo@yuriev.ru>.
.\" Copyright 2015-2020 Leonid Yuriev <leo@yuriev.ru>.
.\" Copyright 2014-2015 Howard Chu, Symas Corp. All Rights Reserved.
.\" Copyright 2015,2016 Peter-Service R&D LLC <http://billing.ru/>.
.\" Copying restrictions apply. See COPYRIGHT/LICENSE.
.TH MDBX_DUMP 1 "2019-12-05" "MDBX 0.4.x"
.TH MDBX_DUMP 1 "2020-01-20" "MDBX 0.6.x"
.SH NAME
mdbx_dump \- MDBX environment export tool
.SH SYNOPSIS

View File

@@ -1,8 +1,8 @@
.\" Copyright 2015-2019 Leonid Yuriev <leo@yuriev.ru>.
.\" Copyright 2015-2020 Leonid Yuriev <leo@yuriev.ru>.
.\" Copyright 2014-2015 Howard Chu, Symas Corp. All Rights Reserved.
.\" Copyright 2015,2016 Peter-Service R&D LLC <http://billing.ru/>.
.\" Copying restrictions apply. See COPYRIGHT/LICENSE.
.TH MDBX_LOAD 1 "2019-12-05" "MDBX 0.4.x"
.TH MDBX_LOAD 1 "2020-01-20" "MDBX 0.6.x"
.SH NAME
mdbx_load \- MDBX environment import tool
.SH SYNOPSIS

View File

@@ -1,8 +1,8 @@
.\" Copyright 2015-2019 Leonid Yuriev <leo@yuriev.ru>.
.\" Copyright 2015-2020 Leonid Yuriev <leo@yuriev.ru>.
.\" Copyright 2012-2015 Howard Chu, Symas Corp. All Rights Reserved.
.\" Copyright 2015,2016 Peter-Service R&D LLC <http://billing.ru/>.
.\" Copying restrictions apply. See COPYRIGHT/LICENSE.
.TH MDBX_STAT 1 "2019-12-05" "MDBX 0.4.x"
.TH MDBX_STAT 1 "2020-01-20" "MDBX 0.6.x"
.SH NAME
mdbx_stat \- MDBX environment status tool
.SH SYNOPSIS

View File

@@ -1,7 +1,7 @@
/* mdbx_chk.c - memory-mapped database check tool */
/*
* Copyright 2015-2019 Leonid Yuriev <leo@yuriev.ru>
* Copyright 2015-2020 Leonid Yuriev <leo@yuriev.ru>
* and other libmdbx authors: please see AUTHORS file.
* All rights reserved.
*

View File

@@ -1,7 +1,7 @@
/* mdbx_copy.c - memory-mapped database backup tool */
/*
* Copyright 2015-2019 Leonid Yuriev <leo@yuriev.ru>
* Copyright 2015-2020 Leonid Yuriev <leo@yuriev.ru>
* and other libmdbx authors: please see AUTHORS file.
* All rights reserved.
*

View File

@@ -1,7 +1,7 @@
/* mdbx_dump.c - memory-mapped database dump tool */
/*
* Copyright 2015-2019 Leonid Yuriev <leo@yuriev.ru>
* Copyright 2015-2020 Leonid Yuriev <leo@yuriev.ru>
* and other libmdbx authors: please see AUTHORS file.
* All rights reserved.
*

View File

@@ -1,7 +1,7 @@
/* mdbx_load.c - memory-mapped database load tool */
/*
* Copyright 2015-2019 Leonid Yuriev <leo@yuriev.ru>
* Copyright 2015-2020 Leonid Yuriev <leo@yuriev.ru>
* and other libmdbx authors: please see AUTHORS file.
* All rights reserved.
*
@@ -55,7 +55,7 @@ static int version;
static int dbi_flags;
static char *prog;
static int Eof;
static bool Eof;
static MDBX_envinfo envinfo;
static MDBX_val kbuf, dbuf;
@@ -85,20 +85,32 @@ static void readhdr(void) {
dbi_flags = 0;
while (fgets(dbuf.iov_base, (int)dbuf.iov_len, stdin) != NULL) {
lineno++;
if (!strncmp(dbuf.iov_base, "db_pagesize=", STRLENOF("db_pagesize=")) ||
!strncmp(dbuf.iov_base, "duplicates=", STRLENOF("duplicates="))) {
/* LY: silently ignore information fields. */
if (!strncmp(dbuf.iov_base, "db_pagesize=", STRLENOF("db_pagesize="))) {
envinfo.mi_dxb_pagesize =
atoi((char *)dbuf.iov_base + STRLENOF("db_pagesize="));
continue;
} else if (!strncmp(dbuf.iov_base, "VERSION=", STRLENOF("VERSION="))) {
}
if (!strncmp(dbuf.iov_base, "duplicates=", STRLENOF("duplicates="))) {
dbi_flags |= MDBX_DUPSORT;
continue;
}
if (!strncmp(dbuf.iov_base, "VERSION=", STRLENOF("VERSION="))) {
version = atoi((char *)dbuf.iov_base + STRLENOF("VERSION="));
if (version > 3) {
fprintf(stderr, "%s: line %" PRIiSIZE ": unsupported VERSION %d\n",
prog, lineno, version);
exit(EXIT_FAILURE);
}
} else if (!strncmp(dbuf.iov_base, "HEADER=END", STRLENOF("HEADER=END"))) {
break;
} else if (!strncmp(dbuf.iov_base, "format=", STRLENOF("format="))) {
continue;
}
if (!strncmp(dbuf.iov_base, "HEADER=END", STRLENOF("HEADER=END")))
return;
if (!strncmp(dbuf.iov_base, "format=", STRLENOF("format="))) {
if (!strncmp((char *)dbuf.iov_base + STRLENOF("FORMAT="), "print",
STRLENOF("print")))
mode |= PRINT;
@@ -108,21 +120,30 @@ static void readhdr(void) {
lineno, (char *)dbuf.iov_base + STRLENOF("FORMAT="));
exit(EXIT_FAILURE);
}
} else if (!strncmp(dbuf.iov_base, "database=", STRLENOF("database="))) {
continue;
}
if (!strncmp(dbuf.iov_base, "database=", STRLENOF("database="))) {
ptr = memchr(dbuf.iov_base, '\n', dbuf.iov_len);
if (ptr)
*ptr = '\0';
if (subname)
mdbx_free(subname);
subname = mdbx_strdup((char *)dbuf.iov_base + STRLENOF("database="));
} else if (!strncmp(dbuf.iov_base, "type=", STRLENOF("type="))) {
continue;
}
if (!strncmp(dbuf.iov_base, "type=", STRLENOF("type="))) {
if (strncmp((char *)dbuf.iov_base + STRLENOF("type="), "btree",
STRLENOF("btree"))) {
fprintf(stderr, "%s: line %" PRIiSIZE ": unsupported type %s\n", prog,
lineno, (char *)dbuf.iov_base + STRLENOF("type="));
exit(EXIT_FAILURE);
}
} else if (!strncmp(dbuf.iov_base, "mapaddr=", STRLENOF("mapaddr="))) {
continue;
}
if (!strncmp(dbuf.iov_base, "mapaddr=", STRLENOF("mapaddr="))) {
int i;
ptr = memchr(dbuf.iov_base, '\n', dbuf.iov_len);
if (ptr)
@@ -134,7 +155,10 @@ static void readhdr(void) {
lineno, (char *)dbuf.iov_base + STRLENOF("mapaddr="));
exit(EXIT_FAILURE);
}
} else if (!strncmp(dbuf.iov_base, "mapsize=", STRLENOF("mapsize="))) {
continue;
}
if (!strncmp(dbuf.iov_base, "mapsize=", STRLENOF("mapsize="))) {
int i;
ptr = memchr(dbuf.iov_base, '\n', dbuf.iov_len);
if (ptr)
@@ -146,8 +170,10 @@ static void readhdr(void) {
lineno, (char *)dbuf.iov_base + STRLENOF("mapsize="));
exit(EXIT_FAILURE);
}
} else if (!strncmp(dbuf.iov_base,
"maxreaders=", STRLENOF("maxreaders="))) {
continue;
}
if (!strncmp(dbuf.iov_base, "maxreaders=", STRLENOF("maxreaders="))) {
int i;
ptr = memchr(dbuf.iov_base, '\n', dbuf.iov_len);
if (ptr)
@@ -159,31 +185,33 @@ static void readhdr(void) {
lineno, (char *)dbuf.iov_base + STRLENOF("maxreaders="));
exit(EXIT_FAILURE);
}
} else {
int i;
for (i = 0; dbflags[i].bit; i++) {
if (!strncmp(dbuf.iov_base, dbflags[i].name, dbflags[i].len) &&
((char *)dbuf.iov_base)[dbflags[i].len] == '=') {
if (((char *)dbuf.iov_base)[dbflags[i].len + 1] == '1')
dbi_flags |= dbflags[i].bit;
break;
}
continue;
}
int i;
for (i = 0; dbflags[i].bit; i++) {
if (!strncmp(dbuf.iov_base, dbflags[i].name, dbflags[i].len) &&
((char *)dbuf.iov_base)[dbflags[i].len] == '=') {
if (((char *)dbuf.iov_base)[dbflags[i].len + 1] == '1')
dbi_flags |= dbflags[i].bit;
break;
}
if (!dbflags[i].bit) {
ptr = memchr(dbuf.iov_base, '=', dbuf.iov_len);
if (!ptr) {
fprintf(stderr, "%s: line %" PRIiSIZE ": unexpected format\n", prog,
lineno);
exit(EXIT_FAILURE);
} else {
*ptr = '\0';
fprintf(stderr,
"%s: line %" PRIiSIZE ": unrecognized keyword ignored: %s\n",
prog, lineno, (char *)dbuf.iov_base);
}
}
if (!dbflags[i].bit) {
ptr = memchr(dbuf.iov_base, '=', dbuf.iov_len);
if (!ptr) {
fprintf(stderr, "%s: line %" PRIiSIZE ": unexpected format\n", prog,
lineno);
exit(EXIT_FAILURE);
} else {
*ptr = '\0';
fprintf(stderr,
"%s: line %" PRIiSIZE ": unrecognized keyword ignored: %s\n",
prog, lineno, (char *)dbuf.iov_base);
}
}
}
Eof = true;
}
static void badend(void) {
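Note on the readhdr() rework: the long else-if chain becomes a flat sequence of prefix tests that each end in `continue`, and `db_pagesize=` and `duplicates=` are now recorded instead of ignored. A reduced sketch of the same prefix-matching style is shown below, handling only a few of the keywords. `struct hdr` and `parse_header_line` are hypothetical, and `strtol` is used here where the tool itself calls `atoi`.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define STRLENOF(s) (sizeof(s) - 1)

/* Hypothetical container for the fields this sketch understands. */
struct hdr {
  long pagesize;
  int dupsort;
  int version;
};

/* Returns 1 when the line was handled, 0 when the keyword is unknown,
 * and -1 when the end-of-header marker was seen. */
static int parse_header_line(const char *line, struct hdr *h) {
  assert(line != NULL && h != NULL);
  if (!strncmp(line, "db_pagesize=", STRLENOF("db_pagesize="))) {
    h->pagesize = strtol(line + STRLENOF("db_pagesize="), NULL, 10);
    return 1;
  }
  if (!strncmp(line, "duplicates=", STRLENOF("duplicates="))) {
    h->dupsort = 1; /* the presence of the key is enough, as in the diff */
    return 1;
  }
  if (!strncmp(line, "VERSION=", STRLENOF("VERSION="))) {
    h->version = (int)strtol(line + STRLENOF("VERSION="), NULL, 10);
    return 1;
  }
  if (!strncmp(line, "HEADER=END", STRLENOF("HEADER=END")))
    return -1;
  return 0;
}
```

Returning early from each branch keeps every keyword handler independent, which is the main readability gain of the flattened layout.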
@@ -212,14 +240,14 @@ static int readline(MDBX_val *out, MDBX_val *buf) {
if (!(mode & NOHDR)) {
c = fgetc(stdin);
if (c == EOF) {
Eof = 1;
Eof = true;
return EOF;
}
if (c != ' ') {
lineno++;
if (fgets(buf->iov_base, (int)buf->iov_len, stdin) == NULL) {
badend:
Eof = 1;
Eof = true;
badend();
return EOF;
}
@@ -229,7 +257,7 @@ static int readline(MDBX_val *out, MDBX_val *buf) {
}
}
if (fgets(buf->iov_base, (int)buf->iov_len, stdin) == NULL) {
Eof = 1;
Eof = true;
return EOF;
}
lineno++;
@@ -242,7 +270,7 @@ static int readline(MDBX_val *out, MDBX_val *buf) {
while (c1[len - 1] != '\n') {
buf->iov_base = mdbx_realloc(buf->iov_base, buf->iov_len * 2);
if (!buf->iov_base) {
Eof = 1;
Eof = true;
fprintf(stderr, "%s: line %" PRIiSIZE ": out of memory, line too long\n",
prog, lineno);
return EOF;
@@ -250,7 +278,7 @@ static int readline(MDBX_val *out, MDBX_val *buf) {
c1 = buf->iov_base;
c1 += l2;
if (fgets((char *)c1, (int)buf->iov_len + 1, stdin) == NULL) {
Eof = 1;
Eof = true;
badend();
return EOF;
}
@@ -270,7 +298,7 @@ static int readline(MDBX_val *out, MDBX_val *buf) {
*c1++ = '\\';
} else {
if (c2 + 3 > end || !isxdigit(c2[1]) || !isxdigit(c2[2])) {
Eof = 1;
Eof = true;
badend();
return EOF;
}
@@ -285,13 +313,13 @@ static int readline(MDBX_val *out, MDBX_val *buf) {
} else {
/* odd length not allowed */
if (len & 1) {
Eof = 1;
Eof = true;
badend();
return EOF;
}
while (c2 < end) {
if (!isxdigit(*c2) || !isxdigit(c2[1])) {
Eof = 1;
Eof = true;
badend();
return EOF;
}
@@ -366,7 +394,7 @@ int main(int argc, char *argv[]) {
break;
case 'f':
if (freopen(optarg, "r", stdin) == NULL) {
fprintf(stderr, "%s: %s: reopen: %s\n", prog, optarg,
fprintf(stderr, "%s: %s: open: %s\n", prog, optarg,
mdbx_strerror(errno));
exit(EXIT_FAILURE);
}
@@ -433,16 +461,20 @@ int main(int argc, char *argv[]) {
mdbx_env_set_maxdbs(env, 2);
if (envinfo.mi_maxreaders)
mdbx_env_set_maxreaders(env, envinfo.mi_maxreaders);
#ifdef MDBX_FIXEDMAP
if (info.mi_mapaddr)
envflags |= MDBX_FIXEDMAP;
#endif
if (envinfo.mi_mapsize) {
if (envinfo.mi_mapsize > SIZE_MAX) {
fprintf(stderr, "mdbx_env_set_mapsize failed, error %d %s\n", rc,
mdbx_strerror(MDBX_TOO_LARGE));
return EXIT_FAILURE;
if (envinfo.mi_mapsize > INTPTR_MAX) {
fprintf(stderr,
"Database size is too large for current system (mapsize=%" PRIu64
" is great than system-limit %zi)\n",
envinfo.mi_mapsize, INTPTR_MAX);
goto env_close;
}
rc = mdbx_env_set_geometry(env, 0, -1, (size_t)envinfo.mi_mapsize, -1, -1,
rc = mdbx_env_set_geometry(env, 0, 0, (intptr_t)envinfo.mi_mapsize, -1, -1,
-1);
if (rc) {
fprintf(stderr, "mdbx_env_set_geometry failed, error %d %s\n", rc,
@@ -451,11 +483,6 @@ int main(int argc, char *argv[]) {
}
}
#ifdef MDBX_FIXEDMAP
if (info.mi_mapaddr)
envflags |= MDBX_FIXEDMAP;
#endif
rc = mdbx_env_open(env, envname, envflags, 0664);
if (rc) {
fprintf(stderr, "mdbx_env_open failed, error %d %s\n", rc,
@@ -464,7 +491,7 @@ int main(int argc, char *argv[]) {
}
kbuf.iov_len = mdbx_env_get_maxvalsize_ex(env, MDBX_DUPSORT);
if (kbuf.iov_len >= SIZE_MAX / 4) {
if (kbuf.iov_len >= INTPTR_MAX / 4) {
fprintf(stderr, "mdbx_env_get_maxkeysize failed, returns %zu\n",
kbuf.iov_len);
goto env_close;
@@ -574,6 +601,7 @@ int main(int argc, char *argv[]) {
goto env_close;
}
mdbx_dbi_close(env, dbi);
subname = NULL;
/* try read next header */
if (!(mode & NOHDR))

View File

@@ -1,7 +1,7 @@
/* mdbx_stat.c - memory-mapped database status tool */
/*
* Copyright 2015-2019 Leonid Yuriev <leo@yuriev.ru>
* Copyright 2015-2020 Leonid Yuriev <leo@yuriev.ru>
* and other libmdbx authors: please see AUTHORS file.
* All rights reserved.
*

View File

@@ -1,5 +1,5 @@
/*
* Copyright 2017-2019 Leonid Yuriev <leo@yuriev.ru>
* Copyright 2017-2020 Leonid Yuriev <leo@yuriev.ru>
* and other libmdbx authors: please see AUTHORS file.
* All rights reserved.
*

View File

@@ -1,5 +1,5 @@
/*
* Copyright 2017-2019 Leonid Yuriev <leo@yuriev.ru>
* Copyright 2017-2020 Leonid Yuriev <leo@yuriev.ru>
* and other libmdbx authors: please see AUTHORS file.
* All rights reserved.
*

View File

@@ -1,5 +1,5 @@
/*
* Copyright 2017-2019 Leonid Yuriev <leo@yuriev.ru>
* Copyright 2017-2020 Leonid Yuriev <leo@yuriev.ru>
* and other libmdbx authors: please see AUTHORS file.
* All rights reserved.
*

View File

@@ -1,5 +1,5 @@
/*
* Copyright 2017-2019 Leonid Yuriev <leo@yuriev.ru>
* Copyright 2017-2020 Leonid Yuriev <leo@yuriev.ru>
* and other libmdbx authors: please see AUTHORS file.
* All rights reserved.
*

View File

@@ -1,5 +1,5 @@
/*
* Copyright 2017-2019 Leonid Yuriev <leo@yuriev.ru>
* Copyright 2017-2020 Leonid Yuriev <leo@yuriev.ru>
* and other libmdbx authors: please see AUTHORS file.
* All rights reserved.
*

View File

@@ -1,5 +1,5 @@
/*
* Copyright 2017-2019 Leonid Yuriev <leo@yuriev.ru>
* Copyright 2017-2020 Leonid Yuriev <leo@yuriev.ru>
* and other libmdbx authors: please see AUTHORS file.
* All rights reserved.
*
@@ -417,8 +417,9 @@ void dump(const char *title) {
else
log_verbose("no-inject-writefault\n");
log_verbose("limits: readers %u, tables %u\n", i->params.max_readers,
i->params.max_tables);
log_verbose("limits: readers %u, tables %u, txn-bytes %zu\n",
i->params.max_readers, i->params.max_tables,
mdbx_limits_txnsize_max(i->params.pagesize));
log_verbose("drop table: %s\n", i->params.drop_table ? "Yes" : "No");
log_verbose("ignore MDBX_MAP_FULL error: %s\n",

View File

@@ -1,5 +1,5 @@
/*
* Copyright 2017-2019 Leonid Yuriev <leo@yuriev.ru>
* Copyright 2017-2020 Leonid Yuriev <leo@yuriev.ru>
* and other libmdbx authors: please see AUTHORS file.
* All rights reserved.
*

View File

@@ -1,5 +1,5 @@
/*
* Copyright 2017-2019 Leonid Yuriev <leo@yuriev.ru>
* Copyright 2017-2020 Leonid Yuriev <leo@yuriev.ru>
* and other libmdbx authors: please see AUTHORS file.
* All rights reserved.
*

View File

@@ -1,5 +1,5 @@
/*
* Copyright 2017-2019 Leonid Yuriev <leo@yuriev.ru>
* Copyright 2017-2020 Leonid Yuriev <leo@yuriev.ru>
* and other libmdbx authors: please see AUTHORS file.
* All rights reserved.
*

View File

@@ -1,5 +1,5 @@
/*
* Copyright 2017-2019 Leonid Yuriev <leo@yuriev.ru>
* Copyright 2017-2020 Leonid Yuriev <leo@yuriev.ru>
* and other libmdbx authors: please see AUTHORS file.
* All rights reserved.
*
@@ -15,22 +15,15 @@
#include "test.h"
bool testcase_jitter::run() {
int err;
size_t upper_limit = config.params.size_upper;
if (upper_limit < 1)
upper_limit = config.params.size_now * 2;
while (should_continue()) {
jitter_delay();
db_open();
if (flipcoin()) {
jitter_delay();
txn_begin(true);
fetch_canary();
jitter_delay();
txn_end(flipcoin());
}
int err;
intptr_t upper_limit = config.params.size_upper;
if (upper_limit < 1)
upper_limit = config.params.size_now;
if (upper_limit < 1) {
MDBX_envinfo info;
err = mdbx_env_info_ex(db_guard.get(), txn_guard.get(), &info,
@@ -42,30 +35,33 @@ bool testcase_jitter::run() {
: INTPTR_MAX;
}
const bool xcoin = flipcoin();
err = mdbx_env_set_geometry(db_guard.get(), -1, -1,
xcoin ? upper_limit / 2 : upper_limit * 3 / 2,
-1, -1, -1);
if (err != MDBX_SUCCESS && err != MDBX_RESULT_TRUE &&
err != MDBX_MAP_FULL && err != MDBX_TOO_LARGE)
failure_perror("mdbx_env_set_geometry-1", err);
if (flipcoin()) {
jitter_delay();
txn_begin(true);
fetch_canary();
jitter_delay();
txn_end(flipcoin());
}
const bool coin4size = flipcoin();
jitter_delay();
txn_begin(mode_readonly());
jitter_delay();
if (!mode_readonly()) {
fetch_canary();
update_canary(1);
/* TODO:
* - db_setsize()
* ...
*/
err = mdbx_env_set_geometry(
db_guard.get(), -1, -1,
coin4size ? upper_limit * 2 / 3 : upper_limit * 3 / 2, -1, -1, -1);
if (err != MDBX_SUCCESS && err != MDBX_RESULT_TRUE &&
err != MDBX_MAP_FULL && err != MDBX_TOO_LARGE)
failure_perror("mdbx_env_set_geometry-1", err);
}
txn_end(flipcoin());
err = mdbx_env_set_geometry(db_guard.get(), -1, -1,
!xcoin ? upper_limit / 2 : upper_limit * 3 / 2,
-1, -1, -1);
err = mdbx_env_set_geometry(
db_guard.get(), -1, -1,
!coin4size ? upper_limit * 2 / 3 : upper_limit * 3 / 2, -1, -1, -1);
if (err != MDBX_SUCCESS && err != MDBX_RESULT_TRUE &&
err != MDBX_MAP_FULL && err != MDBX_TOO_LARGE)
failure_perror("mdbx_env_set_geometry-2", err);

View File

@@ -1,5 +1,5 @@
/*
* Copyright 2017-2019 Leonid Yuriev <leo@yuriev.ru>
* Copyright 2017-2020 Leonid Yuriev <leo@yuriev.ru>
* and other libmdbx authors: please see AUTHORS file.
* All rights reserved.
*

View File

@@ -1,5 +1,5 @@
/*
* Copyright 2017-2019 Leonid Yuriev <leo@yuriev.ru>
* Copyright 2017-2020 Leonid Yuriev <leo@yuriev.ru>
* and other libmdbx authors: please see AUTHORS file.
* All rights reserved.
*

View File

@@ -1,5 +1,5 @@
/*
* Copyright 2017-2019 Leonid Yuriev <leo@yuriev.ru>
* Copyright 2017-2020 Leonid Yuriev <leo@yuriev.ru>
* and other libmdbx authors: please see AUTHORS file.
* All rights reserved.
*
@@ -61,8 +61,7 @@ static FILE *last;
void setlevel(loglevel priority) {
level = priority;
int rc = mdbx_setup_debug(priority,
MDBX_DBG_ASSERT | MDBX_DBG_AUDIT | MDBX_DBG_JITTER |
MDBX_DBG_DUMP,
MDBX_DBG_ASSERT | MDBX_DBG_AUDIT | MDBX_DBG_JITTER,
mdbx_logger);
log_trace("set mdbx debug-opts: 0x%02x", rc);
}

@@ -1,5 +1,5 @@
/*
* Copyright 2017-2019 Leonid Yuriev <leo@yuriev.ru>
* Copyright 2017-2020 Leonid Yuriev <leo@yuriev.ru>
* and other libmdbx authors: please see AUTHORS file.
* All rights reserved.
*

@@ -1,5 +1,5 @@
/*
* Copyright 2017-2019 Leonid Yuriev <leo@yuriev.ru>
* Copyright 2017-2020 Leonid Yuriev <leo@yuriev.ru>
* and other libmdbx authors: please see AUTHORS file.
* All rights reserved.
*
@@ -74,8 +74,8 @@ void __noreturn usage(void) {
" --speculum[=yes|NO] Use internal `speculum` to check "
"dataset\n"
"Keys and Value:\n"
" --keygen.min=N Minimal keys length\n"
" --keygen.max=N Miximal keys length\n"
" --keylen.min=N Minimal keys length\n"
" --keylen.max=N Miximal keys length\n"
" --datalen.min=N Minimal data length\n"
" --datalen.max=N Miximal data length\n"
" --keygen.width=N TBD (see the source code)\n"

@@ -1,5 +1,5 @@
/*
* Copyright 2017-2019 Leonid Yuriev <leo@yuriev.ru>
* Copyright 2017-2020 Leonid Yuriev <leo@yuriev.ru>
* and other libmdbx authors: please see AUTHORS file.
* All rights reserved.
*

@@ -1,5 +1,5 @@
/*
* Copyright 2017-2019 Leonid Yuriev <leo@yuriev.ru>
* Copyright 2017-2020 Leonid Yuriev <leo@yuriev.ru>
* and other libmdbx authors: please see AUTHORS file.
* All rights reserved.
*

@@ -1,5 +1,5 @@
/*
* Copyright 2017-2019 Leonid Yuriev <leo@yuriev.ru>
* Copyright 2017-2020 Leonid Yuriev <leo@yuriev.ru>
* and other libmdbx authors: please see AUTHORS file.
* All rights reserved.
*

@@ -1,5 +1,5 @@
/*
* Copyright 2016-2019 Leonid Yuriev <leo@yuriev.ru>.
* Copyright 2016-2020 Leonid Yuriev <leo@yuriev.ru>.
* Copyright 2015 Vladimir Romanov
* <https://www.linkedin.com/in/vladimirromanov>, Yota Lab.
*

@@ -1,5 +1,5 @@
/*
* Copyright 2017-2019 Leonid Yuriev <leo@yuriev.ru>
* Copyright 2017-2020 Leonid Yuriev <leo@yuriev.ru>
* and other libmdbx authors: please see AUTHORS file.
* All rights reserved.
*

@@ -1,5 +1,5 @@
/*
* Copyright 2017-2019 Leonid Yuriev <leo@yuriev.ru>
* Copyright 2017-2020 Leonid Yuriev <leo@yuriev.ru>
* and other libmdbx authors: please see AUTHORS file.
* All rights reserved.
*

@@ -1,5 +1,5 @@
/*
* Copyright 2017-2019 Leonid Yuriev <leo@yuriev.ru>
* Copyright 2017-2020 Leonid Yuriev <leo@yuriev.ru>
* and other libmdbx authors: please see AUTHORS file.
* All rights reserved.
*

@@ -1,5 +1,5 @@
/*
* Copyright 2017-2019 Leonid Yuriev <leo@yuriev.ru>
* Copyright 2017-2020 Leonid Yuriev <leo@yuriev.ru>
* and other libmdbx authors: please see AUTHORS file.
* All rights reserved.
*

@@ -1,5 +1,5 @@
/*
* Copyright 2017-2019 Leonid Yuriev <leo@yuriev.ru>
* Copyright 2017-2020 Leonid Yuriev <leo@yuriev.ru>
* and other libmdbx authors: please see AUTHORS file.
* All rights reserved.
*