From: Ilias Tsitsimpis Date: Sun, 10 Sep 2023 19:09:33 +0000 (+0300) Subject: Import haskell-unordered-containers_0.2.19.1.orig.tar.gz X-Git-Tag: archive/raspbian/0.2.19.1-2+rpi1~1^2~3 X-Git-Url: https://dgit.raspbian.org/?a=commitdiff_plain;h=7f30ec71fe334510bd1316465ee9ab3134c21267;p=haskell-unordered-containers.git Import haskell-unordered-containers_0.2.19.1.orig.tar.gz [dgit import orig haskell-unordered-containers_0.2.19.1.orig.tar.gz] --- 7f30ec71fe334510bd1316465ee9ab3134c21267 diff --git a/CHANGES.md b/CHANGES.md new file mode 100644 index 0000000..28e194a --- /dev/null +++ b/CHANGES.md @@ -0,0 +1,240 @@ +## [0.2.19.1] – April 2022 + +* [Fix bug in `intersection[With[Key]]`](https://github.com/haskell-unordered-containers/unordered-containers/pull/427) + +* [Improve docs of bit twiddling functions](https://github.com/haskell-unordered-containers/unordered-containers/pull/396) + +[0.2.19.1]: https://github.com/haskell-unordered-containers/unordered-containers/compare/v0.2.19.0...v0.2.19.1 + +## [0.2.19.0] – April 2022 + +* [Make intersections much faster](https://github.com/haskell-unordered-containers/unordered-containers/pull/406) + +* [Fix undefined behaviour on 32-bit platforms](https://github.com/haskell-unordered-containers/unordered-containers/pull/413) + +* Speed up some array-appending operations: [#407](https://github.com/haskell-unordered-containers/unordered-containers/pull/407), [#409](https://github.com/haskell-unordered-containers/unordered-containers/pull/409) + +* [Use MathJax format for complexity annotations](https://github.com/haskell-unordered-containers/unordered-containers/pull/411) + +[0.2.19.0]: https://github.com/haskell-unordered-containers/unordered-containers/compare/v0.2.18.0...v0.2.19.0 + +## [0.2.18.0] + +* [Fix strictness properties of `Strict.mapMaybe[WithKey]`](https://github.com/haskell-unordered-containers/unordered-containers/pull/385) + +* [Fix strictness properties of `Strict.alterFEager`](https://github.com/haskell-unordered-containers/unordered-containers/pull/384) + +* [Fix space leaks in `union[With[Key]]`](https://github.com/haskell-unordered-containers/unordered-containers/pull/380) + +* [Fix space leak in `Lazy.fromListWith`](https://github.com/haskell-unordered-containers/unordered-containers/pull/386) + +* [Speed up `difference*` and `intersection*` with `unsafeInsert`](https://github.com/haskell-unordered-containers/unordered-containers/pull/372) + +* [`unionArrayBy`: Find next 1-bits with `countTrailingZeros`](https://github.com/haskell-unordered-containers/unordered-containers/pull/395) + - This speeds up `union*` for sparsely filled nodes, while penalizing `union` operations on densely filled nodes. + +* [Reduce reboxing in internal array operations](https://github.com/haskell-unordered-containers/unordered-containers/pull/377) + +* [Reduce code size of array operations in `union*`](https://github.com/haskell-unordered-containers/unordered-containers/pull/376) + +[0.2.18.0]: https://github.com/haskell-unordered-containers/unordered-containers/compare/v0.2.17.0...v0.2.18.0 + +## [0.2.17.0] + +* [Define `dataCast1` for `HashMap`](https://github.com/haskell-unordered-containers/unordered-containers/pull/345) + +* [Add `Lift` instances for Template Haskell](https://github.com/haskell-unordered-containers/unordered-containers/pull/343) + +* [Add definitions for `stimes`](https://github.com/haskell-unordered-containers/unordered-containers/pull/340) + +* [Expose internal constructors for `HashSet`, `Array` and `MArray`](https://github.com/haskell-unordered-containers/unordered-containers/pull/347) + +* [Tweak internal `Array.insertM` function](https://github.com/haskell-unordered-containers/unordered-containers/pull/359) + +* [Drop support for GHC 8.0](https://github.com/haskell-unordered-containers/unordered-containers/pull/354) + +* [Drop support for `hashable < 1.2.5`](https://github.com/haskell-unordered-containers/unordered-containers/pull/355) + +* Various cleanup and documentation improvements + +[0.2.17.0]: https://github.com/haskell-unordered-containers/unordered-containers/compare/v0.2.16.0...v0.2.17.0 + +## [0.2.16.0] + +* [Increase maximum branching factor from 16 to 32](https://github.com/haskell-unordered-containers/unordered-containers/pull/317) + +* [Tweak `union.goDifferentHash`](https://github.com/haskell-unordered-containers/unordered-containers/pull/277) + +* [Fix debug mode bounds check in `cloneM`](https://github.com/haskell-unordered-containers/unordered-containers/pull/331) + +* [Remove some old internal compatibility code](https://github.com/haskell-unordered-containers/unordered-containers/pull/334) + +[0.2.16.0]: https://github.com/haskell-unordered-containers/unordered-containers/compare/v0.2.15.0...v0.2.16.0 + +## [0.2.15.0] + +* [Add security advisory regarding hash collision attacks](https://github.com/haskell-unordered-containers/unordered-containers/pull/320) + +* [Add support for hashable 1.4](https://github.com/haskell-unordered-containers/unordered-containers/pull/324) + +* [Drop support for GHC < 8](https://github.com/haskell-unordered-containers/unordered-containers/pull/323) + +[0.2.15.0]: https://github.com/haskell-unordered-containers/unordered-containers/compare/v0.2.14.0...v0.2.15.0 + +## [0.2.14.0] + +* [Add `HashMap.mapKeys`.](https://github.com/haskell-unordered-containers/unordered-containers/pull/308) Thanks, Marco Perone! + +* [Add instances for `NFData1` and `NFData2`.](https://github.com/haskell-unordered-containers/unordered-containers/pull/314) Thanks, Isaac Elliott and Oleg Grenrus! + +* [Fix `@since`-annotation for `compose`.](https://github.com/haskell-unordered-containers/unordered-containers/pull/303) Thanks, @Mathnerd314! + +[0.2.14.0]: https://github.com/haskell-unordered-containers/unordered-containers/compare/v0.2.13.0...v0.2.14.0 + +## [0.2.13.0] + +* [Add `HashMap.compose`.](https://github.com/haskell-unordered-containers/unordered-containers/pull/299) Thanks Alexandre Esteves. + +[0.2.13.0]: https://github.com/haskell-unordered-containers/unordered-containers/compare/v0.2.12.0...v0.2.13.0 + +## [0.2.12.0] + +* Add `HashMap.isSubmapOf[By]` and `HashSet.isSubsetOf`. Thanks Sven Keidel. ([#282]) + +* Expose internal modules. ([#283]) + +* Documentation improvements in `Data.HashSet`, including a beginner-friendly + introduction. Thanks Matt Renaud. ([#267]) + +* `HashMap.alterF`: Skip key deletion for absent keys. ([#288]) + +* Remove custom `unsafeShift{L,R}` definitions. ([#281]) + +* Various other documentation improvements. + +[0.2.12.0]: https://github.com/haskell-unordered-containers/unordered-containers/compare/v0.2.11.0...v0.2.12.0 +[#267]: https://github.com/haskell-unordered-containers/unordered-containers/pull/267 +[#281]: https://github.com/haskell-unordered-containers/unordered-containers/pull/281 +[#282]: https://github.com/haskell-unordered-containers/unordered-containers/pull/282 +[#283]: https://github.com/haskell-unordered-containers/unordered-containers/pull/283 +[#288]: https://github.com/haskell-unordered-containers/unordered-containers/pull/288 + +## 0.2.11.0 + + * Add `HashMap.findWithDefault` (soft-deprecates `HashMap.lookupDefault`). + Thanks, Matt Renaud. + + * Add `HashMap.fromListWithKey`. Thanks, Josef Svenningsson. + + * Add more folding functions and use them in `Foldable` instances. Thanks, + David Feuer. + + * Add `HashMap.!?`, a flipped version of `lookup`. Thanks, Matt Renaud. + + * Add a `Bifoldable` instance for `HashMap`. Thanks, Joseph Sible. + + * Add a `HasCallStack` constraint to `(!)`. Thanks, Roman Cheplyaka. + +### Bug fixes + + * Fix a space leak affecting updates on keys with hash collisions. Thanks, + Neil Mitchell. ([#254]) + + * Get rid of some silly thunks that could be left lying around. ([#232]). + Thanks, David Feuer. + +### Other changes + + * Speed up the `Hashable` instances for `HashMap` and `HashSet`. Thanks, + Edward Amsden. + + * Remove a dependency cycle hack from the benchmark suite. Thanks, + Andrew Martin. + + * Improve documentation. Thanks, Tristan McLeay, Li-yao Xia, Gareth Smith, + Simon Jakobi, Sergey Vinokurov, and likely others. + +[#232]: https://github.com/haskell-unordered-containers/unordered-containers/issues/232 +[#254]: https://github.com/haskell-unordered-containers/unordered-containers/issues/254 + +## 0.2.10.0 + + * Add `HashMap.alterF`. + + * Add `HashMap.keysSet`. + + * Make `HashMap.Strict.traverseWithKey` force the results before + installing them in the map. + +## 0.2.9.0 + + * Add `Ord/Ord1/Ord2` instances. (Thanks, Oleg Grenrus) + + * Use `SmallArray#` instead of `Array#` for GHC versions 7.10 and above. + (Thanks, Dmitry Ivanov) + + * Adjust for `Semigroup => Monoid` proposal implementation. + (Thanks, Ryan Scott) + +### Bug fixes + + * Fix a strictness bug in `fromListWith`. + + * Enable eager blackholing for pre-8.2 GHC versions to work around + a runtime system bug. (Thanks, Ben Gamari) + + * Avoid sketchy reimplementation of `ST` when compiling with recent + GHC. + +### Other changes + + * Remove support for GHC versions before 7.8. (Thanks, Dmitry Ivanov) + + * Add internal documentaton. (Thanks, Johan Tibell) + +## 0.2.8.0 + + * Add `Eq1/2`, `Show1/2`, `Read1` instances with `base-4.9` + + * `Eq (HashSet a)` doesn't require `Hashable a` anymore, only `Eq a`. + + * Add `Hashable1/2` with `hashable-1.2.6.0` + + * Add `differenceWith` function. + +## 0.2.7.2 + + * Don't use -fregs-graphs + + * Fix benchmark compilation on stack. + +## 0.2.7.1 + + * Fix linker error related to popcnt. + + * Haddock improvements. + + * Fix benchmark compilation when downloaded from Hackage. + +## 0.2.7.0 + + * Support criterion 1.1 + + * Add unionWithKey for hash maps. + +## 0.2.6.0 + + * Mark several modules as Trustworthy. + + * Add Hashable instances for HashMap and HashSet. + + * Add mapMaybe, mapMaybeWithKey, update, alter, and + intersectionWithKey. + + * Add roles. + + * Add Hashable and Semigroup instances. + +## 0.2.5.1 (2014-10-11) + + * Support base-4.8 diff --git a/Data/HashMap/Internal.hs b/Data/HashMap/Internal.hs new file mode 100644 index 0000000..90bcd7a --- /dev/null +++ b/Data/HashMap/Internal.hs @@ -0,0 +1,2453 @@ +{-# LANGUAGE BangPatterns #-} +{-# LANGUAGE CPP #-} +{-# LANGUAGE DeriveLift #-} +{-# LANGUAGE LambdaCase #-} +{-# LANGUAGE MagicHash #-} +{-# LANGUAGE PatternGuards #-} +{-# LANGUAGE RoleAnnotations #-} +{-# LANGUAGE ScopedTypeVariables #-} +{-# LANGUAGE StandaloneDeriving #-} +{-# LANGUAGE TemplateHaskellQuotes #-} +{-# LANGUAGE TypeFamilies #-} +{-# LANGUAGE TypeInType #-} +{-# LANGUAGE UnboxedSums #-} +{-# LANGUAGE UnboxedTuples #-} +{-# OPTIONS_GHC -fno-full-laziness -funbox-strict-fields #-} +{-# OPTIONS_HADDOCK not-home #-} + +-- | = WARNING +-- +-- This module is considered __internal__. +-- +-- The Package Versioning Policy __does not apply__. +-- +-- The contents of this module may change __in any way whatsoever__ +-- and __without any warning__ between minor versions of this package. +-- +-- Authors importing this module are expected to track development +-- closely. + +module Data.HashMap.Internal + ( + HashMap(..) + , Leaf(..) + + -- * Construction + , empty + , singleton + + -- * Basic interface + , null + , size + , member + , lookup + , (!?) + , findWithDefault + , lookupDefault + , (!) + , insert + , insertWith + , unsafeInsert + , delete + , adjust + , update + , alter + , alterF + , isSubmapOf + , isSubmapOfBy + + -- * Combine + -- ** Union + , union + , unionWith + , unionWithKey + , unions + + -- ** Compose + , compose + + -- * Transformations + , map + , mapWithKey + , traverseWithKey + , mapKeys + + -- * Difference and intersection + , difference + , differenceWith + , intersection + , intersectionWith + , intersectionWithKey + , intersectionWithKey# + + -- * Folds + , foldr' + , foldl' + , foldrWithKey' + , foldlWithKey' + , foldr + , foldl + , foldrWithKey + , foldlWithKey + , foldMapWithKey + + -- * Filter + , mapMaybe + , mapMaybeWithKey + , filter + , filterWithKey + + -- * Conversions + , keys + , elems + + -- ** Lists + , toList + , fromList + , fromListWith + , fromListWithKey + + -- Internals used by the strict version + , Hash + , Bitmap + , bitmapIndexedOrFull + , collision + , hash + , mask + , index + , bitsPerSubkey + , fullNodeMask + , sparseIndex + , two + , unionArrayBy + , update32 + , update32M + , update32With' + , updateOrConcatWithKey + , filterMapAux + , equalKeys + , equalKeys1 + , lookupRecordCollision + , LookupRes(..) + , insert' + , delete' + , lookup' + , insertNewKey + , insertKeyExists + , deleteKeyExists + , insertModifying + , ptrEq + , adjust# + ) where + +import Control.Applicative (Const (..)) +import Control.DeepSeq (NFData (..), NFData1 (..), NFData2 (..)) +import Control.Monad.ST (ST, runST) +import Data.Bifoldable (Bifoldable (..)) +import Data.Bits (complement, countTrailingZeros, popCount, + shiftL, unsafeShiftL, unsafeShiftR, (.&.), + (.|.)) +import Data.Coerce (coerce) +import Data.Data (Constr, Data (..), DataType) +import Data.Functor.Classes (Eq1 (..), Eq2 (..), Ord1 (..), Ord2 (..), + Read1 (..), Show1 (..), Show2 (..)) +import Data.Functor.Identity (Identity (..)) +import Data.Hashable (Hashable) +import Data.Hashable.Lifted (Hashable1, Hashable2) +import Data.HashMap.Internal.List (isPermutationBy, unorderedCompare) +import Data.Semigroup (Semigroup (..), stimesIdempotentMonoid) +import GHC.Exts (Int (..), Int#, TYPE, (==#)) +import GHC.Stack (HasCallStack) +import Prelude hiding (filter, foldl, foldr, lookup, map, + null, pred) +import Text.Read hiding (step) + +import qualified Data.Data as Data +import qualified Data.Foldable as Foldable +import qualified Data.Functor.Classes as FC +import qualified Data.Hashable as H +import qualified Data.Hashable.Lifted as H +import qualified Data.HashMap.Internal.Array as A +import qualified Data.List as List +import qualified GHC.Exts as Exts +import qualified Language.Haskell.TH.Syntax as TH + +-- | A set of values. A set cannot contain duplicate values. +------------------------------------------------------------------------ + +-- | Convenience function. Compute a hash value for the given value. +hash :: H.Hashable a => a -> Hash +hash = fromIntegral . H.hash + +data Leaf k v = L !k v + deriving (Eq) + +instance (NFData k, NFData v) => NFData (Leaf k v) where + rnf (L k v) = rnf k `seq` rnf v + +-- | @since 0.2.17.0 +instance (TH.Lift k, TH.Lift v) => TH.Lift (Leaf k v) where +#if MIN_VERSION_template_haskell(2,16,0) + liftTyped (L k v) = [|| L k $! v ||] +#else + lift (L k v) = [| L k $! v |] +#endif + +-- | @since 0.2.14.0 +instance NFData k => NFData1 (Leaf k) where + liftRnf = liftRnf2 rnf + +-- | @since 0.2.14.0 +instance NFData2 Leaf where + liftRnf2 rnf1 rnf2 (L k v) = rnf1 k `seq` rnf2 v + +-- Invariant: The length of the 1st argument to 'Full' is +-- 2^bitsPerSubkey + +-- | A map from keys to values. A map cannot contain duplicate keys; +-- each key can map to at most one value. +data HashMap k v + = Empty + | BitmapIndexed !Bitmap !(A.Array (HashMap k v)) + | Leaf !Hash !(Leaf k v) + | Full !(A.Array (HashMap k v)) + | Collision !Hash !(A.Array (Leaf k v)) + +type role HashMap nominal representational + +-- | @since 0.2.17.0 +deriving instance (TH.Lift k, TH.Lift v) => TH.Lift (HashMap k v) + +instance (NFData k, NFData v) => NFData (HashMap k v) where + rnf Empty = () + rnf (BitmapIndexed _ ary) = rnf ary + rnf (Leaf _ l) = rnf l + rnf (Full ary) = rnf ary + rnf (Collision _ ary) = rnf ary + +-- | @since 0.2.14.0 +instance NFData k => NFData1 (HashMap k) where + liftRnf = liftRnf2 rnf + +-- | @since 0.2.14.0 +instance NFData2 HashMap where + liftRnf2 _ _ Empty = () + liftRnf2 rnf1 rnf2 (BitmapIndexed _ ary) = liftRnf (liftRnf2 rnf1 rnf2) ary + liftRnf2 rnf1 rnf2 (Leaf _ l) = liftRnf2 rnf1 rnf2 l + liftRnf2 rnf1 rnf2 (Full ary) = liftRnf (liftRnf2 rnf1 rnf2) ary + liftRnf2 rnf1 rnf2 (Collision _ ary) = liftRnf (liftRnf2 rnf1 rnf2) ary + +instance Functor (HashMap k) where + fmap = map + +instance Foldable.Foldable (HashMap k) where + foldMap f = foldMapWithKey (\ _k v -> f v) + {-# INLINE foldMap #-} + foldr = foldr + {-# INLINE foldr #-} + foldl = foldl + {-# INLINE foldl #-} + foldr' = foldr' + {-# INLINE foldr' #-} + foldl' = foldl' + {-# INLINE foldl' #-} + null = null + {-# INLINE null #-} + length = size + {-# INLINE length #-} + +-- | @since 0.2.11 +instance Bifoldable HashMap where + bifoldMap f g = foldMapWithKey (\ k v -> f k `mappend` g v) + {-# INLINE bifoldMap #-} + bifoldr f g = foldrWithKey (\ k v acc -> k `f` (v `g` acc)) + {-# INLINE bifoldr #-} + bifoldl f g = foldlWithKey (\ acc k v -> (acc `f` k) `g` v) + {-# INLINE bifoldl #-} + +-- | '<>' = 'union' +-- +-- If a key occurs in both maps, the mapping from the first will be the mapping in the result. +-- +-- ==== __Examples__ +-- +-- >>> fromList [(1,'a'),(2,'b')] <> fromList [(2,'c'),(3,'d')] +-- fromList [(1,'a'),(2,'b'),(3,'d')] +instance (Eq k, Hashable k) => Semigroup (HashMap k v) where + (<>) = union + {-# INLINE (<>) #-} + stimes = stimesIdempotentMonoid + {-# INLINE stimes #-} + +-- | 'mempty' = 'empty' +-- +-- 'mappend' = 'union' +-- +-- If a key occurs in both maps, the mapping from the first will be the mapping in the result. +-- +-- ==== __Examples__ +-- +-- >>> mappend (fromList [(1,'a'),(2,'b')]) (fromList [(2,'c'),(3,'d')]) +-- fromList [(1,'a'),(2,'b'),(3,'d')] +instance (Eq k, Hashable k) => Monoid (HashMap k v) where + mempty = empty + {-# INLINE mempty #-} + mappend = (<>) + {-# INLINE mappend #-} + +instance (Data k, Data v, Eq k, Hashable k) => Data (HashMap k v) where + gfoldl f z m = z fromList `f` toList m + toConstr _ = fromListConstr + gunfold k z c = case Data.constrIndex c of + 1 -> k (z fromList) + _ -> error "gunfold" + dataTypeOf _ = hashMapDataType + dataCast1 f = Data.gcast1 f + dataCast2 f = Data.gcast2 f + +fromListConstr :: Constr +fromListConstr = Data.mkConstr hashMapDataType "fromList" [] Data.Prefix + +hashMapDataType :: DataType +hashMapDataType = Data.mkDataType "Data.HashMap.Internal.HashMap" [fromListConstr] + +-- | This type is used to store the hash of a key, as produced with 'hash'. +type Hash = Word + +-- | A bitmap as contained by a 'BitmapIndexed' node, or a 'fullNodeMask' +-- corresponding to a 'Full' node. +-- +-- Only the lower 'maxChildren' bits are used. The remaining bits must be zeros. +type Bitmap = Word + +-- | 'Shift' values correspond to the level of the tree that we're currently +-- operating at. At the root level the 'Shift' is @0@. For the subsequent +-- levels the 'Shift' values are 'bitsPerSubkey', @2*'bitsPerSubkey'@ etc. +-- +-- Valid values are non-negative and less than @bitSize (0 :: Word)@. +type Shift = Int + +instance Show2 HashMap where + liftShowsPrec2 spk slk spv slv d m = + FC.showsUnaryWith (liftShowsPrec sp sl) "fromList" d (toList m) + where + sp = liftShowsPrec2 spk slk spv slv + sl = liftShowList2 spk slk spv slv + +instance Show k => Show1 (HashMap k) where + liftShowsPrec = liftShowsPrec2 showsPrec showList + +instance (Eq k, Hashable k, Read k) => Read1 (HashMap k) where + liftReadsPrec rp rl = FC.readsData $ + FC.readsUnaryWith (liftReadsPrec rp' rl') "fromList" fromList + where + rp' = liftReadsPrec rp rl + rl' = liftReadList rp rl + +instance (Eq k, Hashable k, Read k, Read e) => Read (HashMap k e) where + readPrec = parens $ prec 10 $ do + Ident "fromList" <- lexP + fromList <$> readPrec + + readListPrec = readListPrecDefault + +instance (Show k, Show v) => Show (HashMap k v) where + showsPrec d m = showParen (d > 10) $ + showString "fromList " . shows (toList m) + +instance Traversable (HashMap k) where + traverse f = traverseWithKey (const f) + {-# INLINABLE traverse #-} + +instance Eq2 HashMap where + liftEq2 = equal2 + +instance Eq k => Eq1 (HashMap k) where + liftEq = equal1 + +-- | Note that, in the presence of hash collisions, equal @HashMap@s may +-- behave differently, i.e. substitutivity may be violated: +-- +-- >>> data D = A | B deriving (Eq, Show) +-- >>> instance Hashable D where hashWithSalt salt _d = salt +-- +-- >>> x = fromList [(A,1), (B,2)] +-- >>> y = fromList [(B,2), (A,1)] +-- +-- >>> x == y +-- True +-- >>> toList x +-- [(A,1),(B,2)] +-- >>> toList y +-- [(B,2),(A,1)] +-- +-- In general, the lack of substitutivity can be observed with any function +-- that depends on the key ordering, such as folds and traversals. +instance (Eq k, Eq v) => Eq (HashMap k v) where + (==) = equal1 (==) + +-- We rely on there being no Empty constructors in the tree! +-- This ensures that two equal HashMaps will have the same +-- shape, modulo the order of entries in Collisions. +equal1 :: Eq k + => (v -> v' -> Bool) + -> HashMap k v -> HashMap k v' -> Bool +equal1 eq = go + where + go Empty Empty = True + go (BitmapIndexed bm1 ary1) (BitmapIndexed bm2 ary2) + = bm1 == bm2 && A.sameArray1 go ary1 ary2 + go (Leaf h1 l1) (Leaf h2 l2) = h1 == h2 && leafEq l1 l2 + go (Full ary1) (Full ary2) = A.sameArray1 go ary1 ary2 + go (Collision h1 ary1) (Collision h2 ary2) + = h1 == h2 && isPermutationBy leafEq (A.toList ary1) (A.toList ary2) + go _ _ = False + + leafEq (L k1 v1) (L k2 v2) = k1 == k2 && eq v1 v2 + +equal2 :: (k -> k' -> Bool) -> (v -> v' -> Bool) + -> HashMap k v -> HashMap k' v' -> Bool +equal2 eqk eqv t1 t2 = go (leavesAndCollisions t1 []) (leavesAndCollisions t2 []) + where + -- If the two trees are the same, then their lists of 'Leaf's and + -- 'Collision's read from left to right should be the same (modulo the + -- order of elements in 'Collision'). + + go (Leaf k1 l1 : tl1) (Leaf k2 l2 : tl2) + | k1 == k2 && + leafEq l1 l2 + = go tl1 tl2 + go (Collision k1 ary1 : tl1) (Collision k2 ary2 : tl2) + | k1 == k2 && + A.length ary1 == A.length ary2 && + isPermutationBy leafEq (A.toList ary1) (A.toList ary2) + = go tl1 tl2 + go [] [] = True + go _ _ = False + + leafEq (L k v) (L k' v') = eqk k k' && eqv v v' + +instance Ord2 HashMap where + liftCompare2 = cmp + +instance Ord k => Ord1 (HashMap k) where + liftCompare = cmp compare + +-- | The ordering is total and consistent with the `Eq` instance. However, +-- nothing else about the ordering is specified, and it may change from +-- version to version of either this package or of hashable. +instance (Ord k, Ord v) => Ord (HashMap k v) where + compare = cmp compare compare + +cmp :: (k -> k' -> Ordering) -> (v -> v' -> Ordering) + -> HashMap k v -> HashMap k' v' -> Ordering +cmp cmpk cmpv t1 t2 = go (leavesAndCollisions t1 []) (leavesAndCollisions t2 []) + where + go (Leaf k1 l1 : tl1) (Leaf k2 l2 : tl2) + = compare k1 k2 `mappend` + leafCompare l1 l2 `mappend` + go tl1 tl2 + go (Collision k1 ary1 : tl1) (Collision k2 ary2 : tl2) + = compare k1 k2 `mappend` + compare (A.length ary1) (A.length ary2) `mappend` + unorderedCompare leafCompare (A.toList ary1) (A.toList ary2) `mappend` + go tl1 tl2 + go (Leaf _ _ : _) (Collision _ _ : _) = LT + go (Collision _ _ : _) (Leaf _ _ : _) = GT + go [] [] = EQ + go [] _ = LT + go _ [] = GT + go _ _ = error "cmp: Should never happen, leavesAndCollisions includes non Leaf / Collision" + + leafCompare (L k v) (L k' v') = cmpk k k' `mappend` cmpv v v' + +-- Same as 'equal2' but doesn't compare the values. +equalKeys1 :: (k -> k' -> Bool) -> HashMap k v -> HashMap k' v' -> Bool +equalKeys1 eq t1 t2 = go (leavesAndCollisions t1 []) (leavesAndCollisions t2 []) + where + go (Leaf k1 l1 : tl1) (Leaf k2 l2 : tl2) + | k1 == k2 && leafEq l1 l2 + = go tl1 tl2 + go (Collision k1 ary1 : tl1) (Collision k2 ary2 : tl2) + | k1 == k2 && A.length ary1 == A.length ary2 && + isPermutationBy leafEq (A.toList ary1) (A.toList ary2) + = go tl1 tl2 + go [] [] = True + go _ _ = False + + leafEq (L k _) (L k' _) = eq k k' + +-- Same as 'equal1' but doesn't compare the values. +equalKeys :: Eq k => HashMap k v -> HashMap k v' -> Bool +equalKeys = go + where + go :: Eq k => HashMap k v -> HashMap k v' -> Bool + go Empty Empty = True + go (BitmapIndexed bm1 ary1) (BitmapIndexed bm2 ary2) + = bm1 == bm2 && A.sameArray1 go ary1 ary2 + go (Leaf h1 l1) (Leaf h2 l2) = h1 == h2 && leafEq l1 l2 + go (Full ary1) (Full ary2) = A.sameArray1 go ary1 ary2 + go (Collision h1 ary1) (Collision h2 ary2) + = h1 == h2 && isPermutationBy leafEq (A.toList ary1) (A.toList ary2) + go _ _ = False + + leafEq (L k1 _) (L k2 _) = k1 == k2 + +instance Hashable2 HashMap where + liftHashWithSalt2 hk hv salt hm = go salt (leavesAndCollisions hm []) + where + -- go :: Int -> [HashMap k v] -> Int + go s [] = s + go s (Leaf _ l : tl) + = s `hashLeafWithSalt` l `go` tl + -- For collisions we hashmix hash value + -- and then array of values' hashes sorted + go s (Collision h a : tl) + = (s `H.hashWithSalt` h) `hashCollisionWithSalt` a `go` tl + go s (_ : tl) = s `go` tl + + -- hashLeafWithSalt :: Int -> Leaf k v -> Int + hashLeafWithSalt s (L k v) = (s `hk` k) `hv` v + + -- hashCollisionWithSalt :: Int -> A.Array (Leaf k v) -> Int + hashCollisionWithSalt s + = List.foldl' H.hashWithSalt s . arrayHashesSorted s + + -- arrayHashesSorted :: Int -> A.Array (Leaf k v) -> [Int] + arrayHashesSorted s = List.sort . List.map (hashLeafWithSalt s) . A.toList + +instance (Hashable k) => Hashable1 (HashMap k) where + liftHashWithSalt = H.liftHashWithSalt2 H.hashWithSalt + +instance (Hashable k, Hashable v) => Hashable (HashMap k v) where + hashWithSalt salt hm = go salt hm + where + go :: Int -> HashMap k v -> Int + go s Empty = s + go s (BitmapIndexed _ a) = A.foldl' go s a + go s (Leaf h (L _ v)) + = s `H.hashWithSalt` h `H.hashWithSalt` v + -- For collisions we hashmix hash value + -- and then array of values' hashes sorted + go s (Full a) = A.foldl' go s a + go s (Collision h a) + = (s `H.hashWithSalt` h) `hashCollisionWithSalt` a + + hashLeafWithSalt :: Int -> Leaf k v -> Int + hashLeafWithSalt s (L k v) = s `H.hashWithSalt` k `H.hashWithSalt` v + + hashCollisionWithSalt :: Int -> A.Array (Leaf k v) -> Int + hashCollisionWithSalt s + = List.foldl' H.hashWithSalt s . arrayHashesSorted s + + arrayHashesSorted :: Int -> A.Array (Leaf k v) -> [Int] + arrayHashesSorted s = List.sort . List.map (hashLeafWithSalt s) . A.toList + +-- | Helper to get 'Leaf's and 'Collision's as a list. +leavesAndCollisions :: HashMap k v -> [HashMap k v] -> [HashMap k v] +leavesAndCollisions (BitmapIndexed _ ary) a = A.foldr leavesAndCollisions a ary +leavesAndCollisions (Full ary) a = A.foldr leavesAndCollisions a ary +leavesAndCollisions l@(Leaf _ _) a = l : a +leavesAndCollisions c@(Collision _ _) a = c : a +leavesAndCollisions Empty a = a + +-- | Helper function to detect 'Leaf's and 'Collision's. +isLeafOrCollision :: HashMap k v -> Bool +isLeafOrCollision (Leaf _ _) = True +isLeafOrCollision (Collision _ _) = True +isLeafOrCollision _ = False + +------------------------------------------------------------------------ +-- * Construction + +-- | \(O(1)\) Construct an empty map. +empty :: HashMap k v +empty = Empty + +-- | \(O(1)\) Construct a map with a single element. +singleton :: (Hashable k) => k -> v -> HashMap k v +singleton k v = Leaf (hash k) (L k v) + +------------------------------------------------------------------------ +-- * Basic interface + +-- | \(O(1)\) Return 'True' if this map is empty, 'False' otherwise. +null :: HashMap k v -> Bool +null Empty = True +null _ = False + +-- | \(O(n)\) Return the number of key-value mappings in this map. +size :: HashMap k v -> Int +size t = go t 0 + where + go Empty !n = n + go (Leaf _ _) n = n + 1 + go (BitmapIndexed _ ary) n = A.foldl' (flip go) n ary + go (Full ary) n = A.foldl' (flip go) n ary + go (Collision _ ary) n = n + A.length ary + +-- | \(O(\log n)\) Return 'True' if the specified key is present in the +-- map, 'False' otherwise. +member :: (Eq k, Hashable k) => k -> HashMap k a -> Bool +member k m = case lookup k m of + Nothing -> False + Just _ -> True +{-# INLINABLE member #-} + +-- | \(O(\log n)\) Return the value to which the specified key is mapped, +-- or 'Nothing' if this map contains no mapping for the key. +lookup :: (Eq k, Hashable k) => k -> HashMap k v -> Maybe v +-- GHC does not yet perform a worker-wrapper transformation on +-- unboxed sums automatically. That seems likely to happen at some +-- point (possibly as early as GHC 8.6) but for now we do it manually. +lookup k m = case lookup# k m of + (# (# #) | #) -> Nothing + (# | a #) -> Just a +{-# INLINE lookup #-} + +lookup# :: (Eq k, Hashable k) => k -> HashMap k v -> (# (# #) | v #) +lookup# k m = lookupCont (\_ -> (# (# #) | #)) (\v _i -> (# | v #)) (hash k) k 0 m +{-# INLINABLE lookup# #-} + +-- | lookup' is a version of lookup that takes the hash separately. +-- It is used to implement alterF. +lookup' :: Eq k => Hash -> k -> HashMap k v -> Maybe v +-- GHC does not yet perform a worker-wrapper transformation on +-- unboxed sums automatically. That seems likely to happen at some +-- point (possibly as early as GHC 8.6) but for now we do it manually. +-- lookup' would probably prefer to be implemented in terms of its own +-- lookup'#, but it's not important enough and we don't want too much +-- code. +lookup' h k m = case lookupRecordCollision# h k m of + (# (# #) | #) -> Nothing + (# | (# a, _i #) #) -> Just a +{-# INLINE lookup' #-} + +-- The result of a lookup, keeping track of if a hash collision occured. +-- If a collision did not occur then it will have the Int value (-1). +data LookupRes a = Absent | Present a !Int + +-- Internal helper for lookup. This version takes the precomputed hash so +-- that functions that make multiple calls to lookup and related functions +-- (insert, delete) only need to calculate the hash once. +-- +-- It is used by 'alterF' so that hash computation and key comparison only needs +-- to be performed once. With this information you can use the more optimized +-- versions of insert ('insertNewKey', 'insertKeyExists') and delete +-- ('deleteKeyExists') +-- +-- Outcomes: +-- Key not in map => Absent +-- Key in map, no collision => Present v (-1) +-- Key in map, collision => Present v position +lookupRecordCollision :: Eq k => Hash -> k -> HashMap k v -> LookupRes v +lookupRecordCollision h k m = case lookupRecordCollision# h k m of + (# (# #) | #) -> Absent + (# | (# a, i #) #) -> Present a (I# i) -- GHC will eliminate the I# +{-# INLINE lookupRecordCollision #-} + +-- Why do we produce an Int# instead of an Int? Unfortunately, GHC is not +-- yet any good at unboxing things *inside* products, let alone sums. That +-- may be changing in GHC 8.6 or so (there is some work in progress), but +-- for now we use Int# explicitly here. We don't need to push the Int# +-- into lookupCont because inlining takes care of that. +lookupRecordCollision# :: Eq k => Hash -> k -> HashMap k v -> (# (# #) | (# v, Int# #) #) +lookupRecordCollision# h k m = + lookupCont (\_ -> (# (# #) | #)) (\v (I# i) -> (# | (# v, i #) #)) h k 0 m +-- INLINABLE to specialize to the Eq instance. +{-# INLINABLE lookupRecordCollision# #-} + +-- A two-continuation version of lookupRecordCollision. This lets us +-- share source code between lookup and lookupRecordCollision without +-- risking any performance degradation. +-- +-- The absent continuation has type @((# #) -> r)@ instead of just @r@ +-- so we can be representation-polymorphic in the result type. Since +-- this whole thing is always inlined, we don't have to worry about +-- any extra CPS overhead. +-- +-- The @Int@ argument is the offset of the subkey in the hash. When looking up +-- keys at the top-level of a hashmap, the offset should be 0. When looking up +-- keys at level @n@ of a hashmap, the offset should be @n * bitsPerSubkey@. +lookupCont :: + forall rep (r :: TYPE rep) k v. + Eq k + => ((# #) -> r) -- Absent continuation + -> (v -> Int -> r) -- Present continuation + -> Hash -- The hash of the key + -> k + -> Int -- The offset of the subkey in the hash. + -> HashMap k v -> r +lookupCont absent present !h0 !k0 !s0 !m0 = go h0 k0 s0 m0 + where + go :: Eq k => Hash -> k -> Int -> HashMap k v -> r + go !_ !_ !_ Empty = absent (# #) + go h k _ (Leaf hx (L kx x)) + | h == hx && k == kx = present x (-1) + | otherwise = absent (# #) + go h k s (BitmapIndexed b v) + | b .&. m == 0 = absent (# #) + | otherwise = + go h k (s+bitsPerSubkey) (A.index v (sparseIndex b m)) + where m = mask h s + go h k s (Full v) = + go h k (s+bitsPerSubkey) (A.index v (index h s)) + go h k _ (Collision hx v) + | h == hx = lookupInArrayCont absent present k v + | otherwise = absent (# #) +{-# INLINE lookupCont #-} + +-- | \(O(\log n)\) Return the value to which the specified key is mapped, +-- or 'Nothing' if this map contains no mapping for the key. +-- +-- This is a flipped version of 'lookup'. +-- +-- @since 0.2.11 +(!?) :: (Eq k, Hashable k) => HashMap k v -> k -> Maybe v +(!?) m k = lookup k m +{-# INLINE (!?) #-} + + +-- | \(O(\log n)\) Return the value to which the specified key is mapped, +-- or the default value if this map contains no mapping for the key. +-- +-- @since 0.2.11 +findWithDefault :: (Eq k, Hashable k) + => v -- ^ Default value to return. + -> k -> HashMap k v -> v +findWithDefault def k t = case lookup k t of + Just v -> v + _ -> def +{-# INLINABLE findWithDefault #-} + + +-- | \(O(\log n)\) Return the value to which the specified key is mapped, +-- or the default value if this map contains no mapping for the key. +-- +-- DEPRECATED: lookupDefault is deprecated as of version 0.2.11, replaced +-- by 'findWithDefault'. +lookupDefault :: (Eq k, Hashable k) + => v -- ^ Default value to return. + -> k -> HashMap k v -> v +lookupDefault = findWithDefault +{-# INLINE lookupDefault #-} + +-- | \(O(\log n)\) Return the value to which the specified key is mapped. +-- Calls 'error' if this map contains no mapping for the key. +(!) :: (Eq k, Hashable k, HasCallStack) => HashMap k v -> k -> v +(!) m k = case lookup k m of + Just v -> v + Nothing -> error "Data.HashMap.Internal.(!): key not found" +{-# INLINABLE (!) #-} + +infixl 9 ! + +-- | Create a 'Collision' value with two 'Leaf' values. +collision :: Hash -> Leaf k v -> Leaf k v -> HashMap k v +collision h !e1 !e2 = + let v = A.run $ do mary <- A.new 2 e1 + A.write mary 1 e2 + return mary + in Collision h v +{-# INLINE collision #-} + +-- | Create a 'BitmapIndexed' or 'Full' node. +bitmapIndexedOrFull :: Bitmap -> A.Array (HashMap k v) -> HashMap k v +-- The strictness in @ary@ helps achieve a nice code size reduction in +-- @unionWith[Key]@ with GHC 9.2.2. See the Core diffs in +-- https://github.com/haskell-unordered-containers/unordered-containers/pull/376. +bitmapIndexedOrFull b !ary + | b == fullNodeMask = Full ary + | otherwise = BitmapIndexed b ary +{-# INLINE bitmapIndexedOrFull #-} + +-- | \(O(\log n)\) Associate the specified value with the specified +-- key in this map. If this map previously contained a mapping for +-- the key, the old value is replaced. +insert :: (Eq k, Hashable k) => k -> v -> HashMap k v -> HashMap k v +insert k v m = insert' (hash k) k v m +{-# INLINABLE insert #-} + +insert' :: Eq k => Hash -> k -> v -> HashMap k v -> HashMap k v +insert' h0 k0 v0 m0 = go h0 k0 v0 0 m0 + where + go !h !k x !_ Empty = Leaf h (L k x) + go h k x s t@(Leaf hy l@(L ky y)) + | hy == h = if ky == k + then if x `ptrEq` y + then t + else Leaf h (L k x) + else collision h l (L k x) + | otherwise = runST (two s h k x hy t) + go h k x s t@(BitmapIndexed b ary) + | b .&. m == 0 = + let !ary' = A.insert ary i $! Leaf h (L k x) + in bitmapIndexedOrFull (b .|. m) ary' + | otherwise = + let !st = A.index ary i + !st' = go h k x (s+bitsPerSubkey) st + in if st' `ptrEq` st + then t + else BitmapIndexed b (A.update ary i st') + where m = mask h s + i = sparseIndex b m + go h k x s t@(Full ary) = + let !st = A.index ary i + !st' = go h k x (s+bitsPerSubkey) st + in if st' `ptrEq` st + then t + else Full (update32 ary i st') + where i = index h s + go h k x s t@(Collision hy v) + | h == hy = Collision h (updateOrSnocWith (\a _ -> (# a #)) k x v) + | otherwise = go h k x s $ BitmapIndexed (mask hy s) (A.singleton t) +{-# INLINABLE insert' #-} + +-- Insert optimized for the case when we know the key is not in the map. +-- +-- It is only valid to call this when the key does not exist in the map. +-- +-- We can skip: +-- - the key equality check on a Leaf +-- - check for its existence in the array for a hash collision +insertNewKey :: Hash -> k -> v -> HashMap k v -> HashMap k v +insertNewKey !h0 !k0 x0 !m0 = go h0 k0 x0 0 m0 + where + go !h !k x !_ Empty = Leaf h (L k x) + go h k x s t@(Leaf hy l) + | hy == h = collision h l (L k x) + | otherwise = runST (two s h k x hy t) + go h k x s (BitmapIndexed b ary) + | b .&. m == 0 = + let !ary' = A.insert ary i $! Leaf h (L k x) + in bitmapIndexedOrFull (b .|. m) ary' + | otherwise = + let !st = A.index ary i + !st' = go h k x (s+bitsPerSubkey) st + in BitmapIndexed b (A.update ary i st') + where m = mask h s + i = sparseIndex b m + go h k x s (Full ary) = + let !st = A.index ary i + !st' = go h k x (s+bitsPerSubkey) st + in Full (update32 ary i st') + where i = index h s + go h k x s t@(Collision hy v) + | h == hy = Collision h (A.snoc v (L k x)) + | otherwise = + go h k x s $ BitmapIndexed (mask hy s) (A.singleton t) +{-# NOINLINE insertNewKey #-} + + +-- Insert optimized for the case when we know the key is in the map. +-- +-- It is only valid to call this when the key exists in the map and you know the +-- hash collision position if there was one. This information can be obtained +-- from 'lookupRecordCollision'. If there is no collision pass (-1) as collPos +-- (first argument). +-- +-- We can skip the key equality check on a Leaf because we know the leaf must be +-- for this key. +insertKeyExists :: Int -> Hash -> k -> v -> HashMap k v -> HashMap k v +insertKeyExists !collPos0 !h0 !k0 x0 !m0 = go collPos0 h0 k0 x0 0 m0 + where + go !_collPos !h !k x !_s (Leaf _hy _kx) + = Leaf h (L k x) + go collPos h k x s (BitmapIndexed b ary) + | b .&. m == 0 = + let !ary' = A.insert ary i $ Leaf h (L k x) + in bitmapIndexedOrFull (b .|. m) ary' + | otherwise = + let !st = A.index ary i + !st' = go collPos h k x (s+bitsPerSubkey) st + in BitmapIndexed b (A.update ary i st') + where m = mask h s + i = sparseIndex b m + go collPos h k x s (Full ary) = + let !st = A.index ary i + !st' = go collPos h k x (s+bitsPerSubkey) st + in Full (update32 ary i st') + where i = index h s + go collPos h k x _s (Collision _hy v) + | collPos >= 0 = Collision h (setAtPosition collPos k x v) + | otherwise = Empty -- error "Internal error: go {collPos negative}" + go _ _ _ _ _ Empty = Empty -- error "Internal error: go Empty" + +{-# NOINLINE insertKeyExists #-} + +-- Replace the ith Leaf with Leaf k v. +-- +-- This does not check that @i@ is within bounds of the array. +setAtPosition :: Int -> k -> v -> A.Array (Leaf k v) -> A.Array (Leaf k v) +setAtPosition i k x ary = A.update ary i (L k x) +{-# INLINE setAtPosition #-} + + +-- | In-place update version of insert +unsafeInsert :: (Eq k, Hashable k) => k -> v -> HashMap k v -> HashMap k v +unsafeInsert k0 v0 m0 = runST (go h0 k0 v0 0 m0) + where + h0 = hash k0 + go !h !k x !_ Empty = return $! Leaf h (L k x) + go h k x s t@(Leaf hy l@(L ky y)) + | hy == h = if ky == k + then if x `ptrEq` y + then return t + else return $! Leaf h (L k x) + else return $! collision h l (L k x) + | otherwise = two s h k x hy t + go h k x s t@(BitmapIndexed b ary) + | b .&. m == 0 = do + ary' <- A.insertM ary i $! Leaf h (L k x) + return $! bitmapIndexedOrFull (b .|. m) ary' + | otherwise = do + st <- A.indexM ary i + st' <- go h k x (s+bitsPerSubkey) st + A.unsafeUpdateM ary i st' + return t + where m = mask h s + i = sparseIndex b m + go h k x s t@(Full ary) = do + st <- A.indexM ary i + st' <- go h k x (s+bitsPerSubkey) st + A.unsafeUpdateM ary i st' + return t + where i = index h s + go h k x s t@(Collision hy v) + | h == hy = return $! Collision h (updateOrSnocWith (\a _ -> (# a #)) k x v) + | otherwise = go h k x s $ BitmapIndexed (mask hy s) (A.singleton t) +{-# INLINABLE unsafeInsert #-} + +-- | Create a map from two key-value pairs which hashes don't collide. To +-- enhance sharing, the second key-value pair is represented by the hash of its +-- key and a singleton HashMap pairing its key with its value. +-- +-- Note: to avoid silly thunks, this function must be strict in the +-- key. See issue #232. We don't need to force the HashMap argument +-- because it's already in WHNF (having just been matched) and we +-- just put it directly in an array. +two :: Shift -> Hash -> k -> v -> Hash -> HashMap k v -> ST s (HashMap k v) +two = go + where + go s h1 k1 v1 h2 t2 + | bp1 == bp2 = do + st <- go (s+bitsPerSubkey) h1 k1 v1 h2 t2 + ary <- A.singletonM st + return $ BitmapIndexed bp1 ary + | otherwise = do + mary <- A.new 2 $! Leaf h1 (L k1 v1) + A.write mary idx2 t2 + ary <- A.unsafeFreeze mary + return $ BitmapIndexed (bp1 .|. bp2) ary + where + bp1 = mask h1 s + bp2 = mask h2 s + idx2 | index h1 s < index h2 s = 1 + | otherwise = 0 +{-# INLINE two #-} + +-- | \(O(\log n)\) Associate the value with the key in this map. If +-- this map previously contained a mapping for the key, the old value +-- is replaced by the result of applying the given function to the new +-- and old value. Example: +-- +-- > insertWith f k v map +-- > where f new old = new + old +insertWith :: (Eq k, Hashable k) => (v -> v -> v) -> k -> v -> HashMap k v + -> HashMap k v +-- We're not going to worry about allocating a function closure +-- to pass to insertModifying. See comments at 'adjust'. +insertWith f k new m = insertModifying new (\old -> (# f new old #)) k m +{-# INLINE insertWith #-} + +-- | @insertModifying@ is a lot like insertWith; we use it to implement alterF. +-- It takes a value to insert when the key is absent and a function +-- to apply to calculate a new value when the key is present. Thanks +-- to the unboxed unary tuple, we avoid introducing any unnecessary +-- thunks in the tree. +insertModifying :: (Eq k, Hashable k) => v -> (v -> (# v #)) -> k -> HashMap k v + -> HashMap k v +insertModifying x f k0 m0 = go h0 k0 0 m0 + where + !h0 = hash k0 + go !h !k !_ Empty = Leaf h (L k x) + go h k s t@(Leaf hy l@(L ky y)) + | hy == h = if ky == k + then case f y of + (# v' #) | ptrEq y v' -> t + | otherwise -> Leaf h (L k v') + else collision h l (L k x) + | otherwise = runST (two s h k x hy t) + go h k s t@(BitmapIndexed b ary) + | b .&. m == 0 = + let ary' = A.insert ary i $! Leaf h (L k x) + in bitmapIndexedOrFull (b .|. m) ary' + | otherwise = + let !st = A.index ary i + !st' = go h k (s+bitsPerSubkey) st + ary' = A.update ary i $! st' + in if ptrEq st st' + then t + else BitmapIndexed b ary' + where m = mask h s + i = sparseIndex b m + go h k s t@(Full ary) = + let !st = A.index ary i + !st' = go h k (s+bitsPerSubkey) st + ary' = update32 ary i $! st' + in if ptrEq st st' + then t + else Full ary' + where i = index h s + go h k s t@(Collision hy v) + | h == hy = + let !v' = insertModifyingArr x f k v + in if A.unsafeSameArray v v' + then t + else Collision h v' + | otherwise = go h k s $ BitmapIndexed (mask hy s) (A.singleton t) +{-# INLINABLE insertModifying #-} + +-- Like insertModifying for arrays; used to implement insertModifying +insertModifyingArr :: Eq k => v -> (v -> (# v #)) -> k -> A.Array (Leaf k v) + -> A.Array (Leaf k v) +insertModifyingArr x f k0 ary0 = go k0 ary0 0 (A.length ary0) + where + go !k !ary !i !n + -- Not found, append to the end. + | i >= n = A.snoc ary $ L k x + | otherwise = case A.index ary i of + (L kx y) | k == kx -> case f y of + (# y' #) -> if ptrEq y y' + then ary + else A.update ary i (L k y') + | otherwise -> go k ary (i+1) n +{-# INLINE insertModifyingArr #-} + +-- | In-place update version of insertWith +unsafeInsertWith :: forall k v. (Eq k, Hashable k) + => (v -> v -> v) -> k -> v -> HashMap k v + -> HashMap k v +unsafeInsertWith f k0 v0 m0 = unsafeInsertWithKey (\_ a b -> (# f a b #)) k0 v0 m0 +{-# INLINABLE unsafeInsertWith #-} + +unsafeInsertWithKey :: forall k v. (Eq k, Hashable k) + => (k -> v -> v -> (# v #)) -> k -> v -> HashMap k v + -> HashMap k v +unsafeInsertWithKey f k0 v0 m0 = runST (go h0 k0 v0 0 m0) + where + h0 = hash k0 + go :: Hash -> k -> v -> Shift -> HashMap k v -> ST s (HashMap k v) + go !h !k x !_ Empty = return $! Leaf h (L k x) + go h k x s t@(Leaf hy l@(L ky y)) + | hy == h = if ky == k + then case f k x y of + (# v #) -> return $! Leaf h (L k v) + else return $! collision h l (L k x) + | otherwise = two s h k x hy t + go h k x s t@(BitmapIndexed b ary) + | b .&. m == 0 = do + ary' <- A.insertM ary i $! Leaf h (L k x) + return $! bitmapIndexedOrFull (b .|. m) ary' + | otherwise = do + st <- A.indexM ary i + st' <- go h k x (s+bitsPerSubkey) st + A.unsafeUpdateM ary i st' + return t + where m = mask h s + i = sparseIndex b m + go h k x s t@(Full ary) = do + st <- A.indexM ary i + st' <- go h k x (s+bitsPerSubkey) st + A.unsafeUpdateM ary i st' + return t + where i = index h s + go h k x s t@(Collision hy v) + | h == hy = return $! Collision h (updateOrSnocWithKey f k x v) + | otherwise = go h k x s $ BitmapIndexed (mask hy s) (A.singleton t) +{-# INLINABLE unsafeInsertWithKey #-} + +-- | \(O(\log n)\) Remove the mapping for the specified key from this map +-- if present. +delete :: (Eq k, Hashable k) => k -> HashMap k v -> HashMap k v +delete k m = delete' (hash k) k m +{-# INLINABLE delete #-} + +delete' :: Eq k => Hash -> k -> HashMap k v -> HashMap k v +delete' h0 k0 m0 = go h0 k0 0 m0 + where + go !_ !_ !_ Empty = Empty + go h k _ t@(Leaf hy (L ky _)) + | hy == h && ky == k = Empty + | otherwise = t + go h k s t@(BitmapIndexed b ary) + | b .&. m == 0 = t + | otherwise = + let !st = A.index ary i + !st' = go h k (s+bitsPerSubkey) st + in if st' `ptrEq` st + then t + else case st' of + Empty | A.length ary == 1 -> Empty + | A.length ary == 2 -> + case (i, A.index ary 0, A.index ary 1) of + (0, _, l) | isLeafOrCollision l -> l + (1, l, _) | isLeafOrCollision l -> l + _ -> bIndexed + | otherwise -> bIndexed + where + bIndexed = BitmapIndexed (b .&. complement m) (A.delete ary i) + l | isLeafOrCollision l && A.length ary == 1 -> l + _ -> BitmapIndexed b (A.update ary i st') + where m = mask h s + i = sparseIndex b m + go h k s t@(Full ary) = + let !st = A.index ary i + !st' = go h k (s+bitsPerSubkey) st + in if st' `ptrEq` st + then t + else case st' of + Empty -> + let ary' = A.delete ary i + bm = fullNodeMask .&. complement (1 `unsafeShiftL` i) + in BitmapIndexed bm ary' + _ -> Full (A.update ary i st') + where i = index h s + go h k _ t@(Collision hy v) + | h == hy = case indexOf k v of + Just i + | A.length v == 2 -> + if i == 0 + then Leaf h (A.index v 1) + else Leaf h (A.index v 0) + | otherwise -> Collision h (A.delete v i) + Nothing -> t + | otherwise = t +{-# INLINABLE delete' #-} + +-- | Delete optimized for the case when we know the key is in the map. +-- +-- It is only valid to call this when the key exists in the map and you know the +-- hash collision position if there was one. This information can be obtained +-- from 'lookupRecordCollision'. If there is no collision pass (-1) as collPos. +-- +-- We can skip: +-- - the key equality check on the leaf, if we reach a leaf it must be the key +deleteKeyExists :: Int -> Hash -> k -> HashMap k v -> HashMap k v +deleteKeyExists !collPos0 !h0 !k0 !m0 = go collPos0 h0 k0 0 m0 + where + go :: Int -> Hash -> k -> Int -> HashMap k v -> HashMap k v + go !_collPos !_h !_k !_s (Leaf _ _) = Empty + go collPos h k s (BitmapIndexed b ary) = + let !st = A.index ary i + !st' = go collPos h k (s+bitsPerSubkey) st + in case st' of + Empty | A.length ary == 1 -> Empty + | A.length ary == 2 -> + case (i, A.index ary 0, A.index ary 1) of + (0, _, l) | isLeafOrCollision l -> l + (1, l, _) | isLeafOrCollision l -> l + _ -> bIndexed + | otherwise -> bIndexed + where + bIndexed = BitmapIndexed (b .&. complement m) (A.delete ary i) + l | isLeafOrCollision l && A.length ary == 1 -> l + _ -> BitmapIndexed b (A.update ary i st') + where m = mask h s + i = sparseIndex b m + go collPos h k s (Full ary) = + let !st = A.index ary i + !st' = go collPos h k (s+bitsPerSubkey) st + in case st' of + Empty -> + let ary' = A.delete ary i + bm = fullNodeMask .&. complement (1 `unsafeShiftL` i) + in BitmapIndexed bm ary' + _ -> Full (A.update ary i st') + where i = index h s + go collPos h _ _ (Collision _hy v) + | A.length v == 2 + = if collPos == 0 + then Leaf h (A.index v 1) + else Leaf h (A.index v 0) + | otherwise = Collision h (A.delete v collPos) + go !_ !_ !_ !_ Empty = Empty -- error "Internal error: deleteKeyExists empty" +{-# NOINLINE deleteKeyExists #-} + +-- | \(O(\log n)\) Adjust the value tied to a given key in this map only +-- if it is present. Otherwise, leave the map alone. +adjust :: (Eq k, Hashable k) => (v -> v) -> k -> HashMap k v -> HashMap k v +-- This operation really likes to leak memory, so using this +-- indirect implementation shouldn't hurt much. Furthermore, it allows +-- GHC to avoid a leak when the function is lazy. In particular, +-- +-- adjust (const x) k m +-- ==> adjust# (\v -> (# const x v #)) k m +-- ==> adjust# (\_ -> (# x #)) k m +adjust f k m = adjust# (\v -> (# f v #)) k m +{-# INLINE adjust #-} + +-- | Much like 'adjust', but not inherently leaky. +adjust# :: (Eq k, Hashable k) => (v -> (# v #)) -> k -> HashMap k v -> HashMap k v +adjust# f k0 m0 = go h0 k0 0 m0 + where + h0 = hash k0 + go !_ !_ !_ Empty = Empty + go h k _ t@(Leaf hy (L ky y)) + | hy == h && ky == k = case f y of + (# y' #) | ptrEq y y' -> t + | otherwise -> Leaf h (L k y') + | otherwise = t + go h k s t@(BitmapIndexed b ary) + | b .&. m == 0 = t + | otherwise = let !st = A.index ary i + !st' = go h k (s+bitsPerSubkey) st + ary' = A.update ary i $! st' + in if ptrEq st st' + then t + else BitmapIndexed b ary' + where m = mask h s + i = sparseIndex b m + go h k s t@(Full ary) = + let i = index h s + !st = A.index ary i + !st' = go h k (s+bitsPerSubkey) st + ary' = update32 ary i $! st' + in if ptrEq st st' + then t + else Full ary' + go h k _ t@(Collision hy v) + | h == hy = let !v' = updateWith# f k v + in if A.unsafeSameArray v v' + then t + else Collision h v' + | otherwise = t +{-# INLINABLE adjust# #-} + +-- | \(O(\log n)\) The expression @('update' f k map)@ updates the value @x@ at @k@ +-- (if it is in the map). If @(f x)@ is 'Nothing', the element is deleted. +-- If it is @('Just' y)@, the key @k@ is bound to the new value @y@. +update :: (Eq k, Hashable k) => (a -> Maybe a) -> k -> HashMap k a -> HashMap k a +update f = alter (>>= f) +{-# INLINABLE update #-} + + +-- | \(O(\log n)\) The expression @('alter' f k map)@ alters the value @x@ at @k@, or +-- absence thereof. +-- +-- 'alter' can be used to insert, delete, or update a value in a map. In short: +-- +-- @ +-- 'lookup' k ('alter' f k m) = f ('lookup' k m) +-- @ +alter :: (Eq k, Hashable k) => (Maybe v -> Maybe v) -> k -> HashMap k v -> HashMap k v +-- TODO(m-renaud): Consider using specialized insert and delete for alter. +alter f k m = + case f (lookup k m) of + Nothing -> delete k m + Just v -> insert k v m +{-# INLINABLE alter #-} + +-- | \(O(\log n)\) The expression @('alterF' f k map)@ alters the value @x@ at +-- @k@, or absence thereof. +-- +-- 'alterF' can be used to insert, delete, or update a value in a map. +-- +-- Note: 'alterF' is a flipped version of the 'at' combinator from +-- . +-- +-- @since 0.2.10 +alterF :: (Functor f, Eq k, Hashable k) + => (Maybe v -> f (Maybe v)) -> k -> HashMap k v -> f (HashMap k v) +-- We only calculate the hash once, but unless this is rewritten +-- by rules we may test for key equality multiple times. +-- We force the value of the map for consistency with the rewritten +-- version; otherwise someone could tell the difference using a lazy +-- @f@ and a functor that is similar to Const but not actually Const. +alterF f = \ !k !m -> + let + !h = hash k + mv = lookup' h k m + in (<$> f mv) $ \case + Nothing -> maybe m (const (delete' h k m)) mv + Just v' -> insert' h k v' m + +-- We unconditionally rewrite alterF in RULES, but we expose an +-- unfolding just in case it's used in some way that prevents the +-- rule from firing. +{-# INLINABLE [0] alterF #-} + +-- This is just a bottom value. See the comment on the "alterFWeird" +-- rule. +test_bottom :: a +test_bottom = error "Data.HashMap.alterF internal error: hit test_bottom" + +-- We use this as an error result in RULES to ensure we don't get +-- any useless CallStack nonsense. +bogus# :: (# #) -> (# a #) +bogus# _ = error "Data.HashMap.alterF internal error: hit bogus#" + +{-# RULES +-- We probe the behavior of @f@ by applying it to Nothing and to +-- Just test_bottom. Based on the results, and how they relate to +-- each other, we choose the best implementation. + +"alterFWeird" forall f. alterF f = + alterFWeird (f Nothing) (f (Just test_bottom)) f + +-- This rule covers situations where alterF is used to simply insert or +-- delete in Identity (most likely via Control.Lens.At). We recognize here +-- (through the repeated @x@ on the LHS) that +-- +-- @f Nothing = f (Just bottom)@, +-- +-- which guarantees that @f@ doesn't care what its argument is, so +-- we don't have to either. +-- +-- Why only Identity? A variant of this rule is actually valid regardless of +-- the functor, but for some functors (e.g., []), it can lead to the +-- same keys being compared multiple times, which is bad if they're +-- ugly things like strings. This is unfortunate, since the rule is likely +-- a good idea for almost all realistic uses, but I don't like nasty +-- edge cases. +"alterFconstant" forall (f :: Maybe a -> Identity (Maybe a)) x. + alterFWeird x x f = \ !k !m -> + Identity (case runIdentity x of {Nothing -> delete k m; Just a -> insert k a m}) + +-- This rule handles the case where 'alterF' is used to do 'insertWith'-like +-- things. Whenever possible, GHC will get rid of the Maybe nonsense for us. +-- We delay this rule to stage 1 so alterFconstant has a chance to fire. +"alterFinsertWith" [1] forall (f :: Maybe a -> Identity (Maybe a)) x y. + alterFWeird (coerce (Just x)) (coerce (Just y)) f = + coerce (insertModifying x (\mold -> case runIdentity (f (Just mold)) of + Nothing -> bogus# (# #) + Just new -> (# new #))) + +-- Handle the case where someone uses 'alterF' instead of 'adjust'. This +-- rule is kind of picky; it will only work if the function doesn't +-- do anything between case matching on the Maybe and producing a result. +"alterFadjust" forall (f :: Maybe a -> Identity (Maybe a)) _y. + alterFWeird (coerce Nothing) (coerce (Just _y)) f = + coerce (adjust# (\x -> case runIdentity (f (Just x)) of + Just x' -> (# x' #) + Nothing -> bogus# (# #))) + +-- The simple specialization to Const; in this case we can look up +-- the key without caring what position it's in. This is only a tiny +-- optimization. +"alterFlookup" forall _ign1 _ign2 (f :: Maybe a -> Const r (Maybe a)). + alterFWeird _ign1 _ign2 f = \ !k !m -> Const (getConst (f (lookup k m))) + #-} + +-- This is a very unsafe version of alterF used for RULES. When calling +-- alterFWeird x y f, the following *must* hold: +-- +-- x = f Nothing +-- y = f (Just _|_) +-- +-- Failure to abide by these laws will make demons come out of your nose. +alterFWeird + :: (Functor f, Eq k, Hashable k) + => f (Maybe v) + -> f (Maybe v) + -> (Maybe v -> f (Maybe v)) -> k -> HashMap k v -> f (HashMap k v) +alterFWeird _ _ f = alterFEager f +{-# INLINE [0] alterFWeird #-} + +-- | This is the default version of alterF that we use in most non-trivial +-- cases. It's called "eager" because it looks up the given key in the map +-- eagerly, whether or not the given function requires that information. +alterFEager :: (Functor f, Eq k, Hashable k) + => (Maybe v -> f (Maybe v)) -> k -> HashMap k v -> f (HashMap k v) +alterFEager f !k m = (<$> f mv) $ \case + + ------------------------------ + -- Delete the key from the map. + Nothing -> case lookupRes of + + -- Key did not exist in the map to begin with, no-op + Absent -> m + + -- Key did exist + Present _ collPos -> deleteKeyExists collPos h k m + + ------------------------------ + -- Update value + Just v' -> case lookupRes of + + -- Key did not exist before, insert v' under a new key + Absent -> insertNewKey h k v' m + + -- Key existed before + Present v collPos -> + if v `ptrEq` v' + -- If the value is identical, no-op + then m + -- If the value changed, update the value. + else insertKeyExists collPos h k v' m + + where !h = hash k + !lookupRes = lookupRecordCollision h k m + !mv = case lookupRes of + Absent -> Nothing + Present v _ -> Just v +{-# INLINABLE alterFEager #-} + +-- | \(O(n \log m)\) Inclusion of maps. A map is included in another map if the keys +-- are subsets and the corresponding values are equal: +-- +-- > isSubmapOf m1 m2 = keys m1 `isSubsetOf` keys m2 && +-- > and [ v1 == v2 | (k1,v1) <- toList m1; let v2 = m2 ! k1 ] +-- +-- ==== __Examples__ +-- +-- >>> fromList [(1,'a')] `isSubmapOf` fromList [(1,'a'),(2,'b')] +-- True +-- +-- >>> fromList [(1,'a'),(2,'b')] `isSubmapOf` fromList [(1,'a')] +-- False +-- +-- @since 0.2.12 +isSubmapOf :: (Eq k, Hashable k, Eq v) => HashMap k v -> HashMap k v -> Bool +isSubmapOf = Exts.inline isSubmapOfBy (==) +{-# INLINABLE isSubmapOf #-} + +-- | \(O(n \log m)\) Inclusion of maps with value comparison. A map is included in +-- another map if the keys are subsets and if the comparison function is true +-- for the corresponding values: +-- +-- > isSubmapOfBy cmpV m1 m2 = keys m1 `isSubsetOf` keys m2 && +-- > and [ v1 `cmpV` v2 | (k1,v1) <- toList m1; let v2 = m2 ! k1 ] +-- +-- ==== __Examples__ +-- +-- >>> isSubmapOfBy (<=) (fromList [(1,'a')]) (fromList [(1,'b'),(2,'c')]) +-- True +-- +-- >>> isSubmapOfBy (<=) (fromList [(1,'b')]) (fromList [(1,'a'),(2,'c')]) +-- False +-- +-- @since 0.2.12 +isSubmapOfBy :: (Eq k, Hashable k) => (v1 -> v2 -> Bool) -> HashMap k v1 -> HashMap k v2 -> Bool +-- For maps without collisions the complexity is O(n*log m), where n is the size +-- of m1 and m the size of m2: the inclusion operation visits every leaf in m1 at least once. +-- For each leaf in m1, it looks up the key in m2. +-- +-- The worst case complexity is O(n*m). The worst case is when both hashmaps m1 +-- and m2 are collision nodes for the same hash. Since collision nodes are +-- unsorted arrays, it requires for every key in m1 a linear search to to find a +-- matching key in m2, hence O(n*m). +isSubmapOfBy comp !m1 !m2 = go 0 m1 m2 + where + -- An empty map is always a submap of any other map. + go _ Empty _ = True + + -- If the second map is empty and the first is not, it cannot be a submap. + go _ _ Empty = False + + -- If the first map contains only one entry, lookup the key in the second map. + go s (Leaf h1 (L k1 v1)) t2 = lookupCont (\_ -> False) (\v2 _ -> comp v1 v2) h1 k1 s t2 + + -- In this case, we need to check that for each x in ls1, there is a y in + -- ls2 such that x `comp` y. This is the worst case complexity-wise since it + -- requires a O(m*n) check. + go _ (Collision h1 ls1) (Collision h2 ls2) = + h1 == h2 && subsetArray comp ls1 ls2 + + -- In this case, we only need to check the entries in ls2 with the hash h1. + go s t1@(Collision h1 _) (BitmapIndexed b ls2) + | b .&. m == 0 = False + | otherwise = + go (s+bitsPerSubkey) t1 (A.index ls2 (sparseIndex b m)) + where m = mask h1 s + + -- Similar to the previous case we need to traverse l2 at the index for the hash h1. + go s t1@(Collision h1 _) (Full ls2) = + go (s+bitsPerSubkey) t1 (A.index ls2 (index h1 s)) + + -- In cases where the first and second map are BitmapIndexed or Full, + -- traverse down the tree at the appropriate indices. + go s (BitmapIndexed b1 ls1) (BitmapIndexed b2 ls2) = + submapBitmapIndexed (go (s+bitsPerSubkey)) b1 ls1 b2 ls2 + go s (BitmapIndexed b1 ls1) (Full ls2) = + submapBitmapIndexed (go (s+bitsPerSubkey)) b1 ls1 fullNodeMask ls2 + go s (Full ls1) (Full ls2) = + submapBitmapIndexed (go (s+bitsPerSubkey)) fullNodeMask ls1 fullNodeMask ls2 + + -- Collision and Full nodes always contain at least two entries. Hence it + -- cannot be a map of a leaf. + go _ (Collision {}) (Leaf {}) = False + go _ (BitmapIndexed {}) (Leaf {}) = False + go _ (Full {}) (Leaf {}) = False + go _ (BitmapIndexed {}) (Collision {}) = False + go _ (Full {}) (Collision {}) = False + go _ (Full {}) (BitmapIndexed {}) = False +{-# INLINABLE isSubmapOfBy #-} + +-- | \(O(\min n m))\) Checks if a bitmap indexed node is a submap of another. +submapBitmapIndexed :: (HashMap k v1 -> HashMap k v2 -> Bool) -> Bitmap -> A.Array (HashMap k v1) -> Bitmap -> A.Array (HashMap k v2) -> Bool +submapBitmapIndexed comp !b1 !ary1 !b2 !ary2 = subsetBitmaps && go 0 0 (b1Orb2 .&. negate b1Orb2) + where + go :: Int -> Int -> Bitmap -> Bool + go !i !j !m + | m > b1Orb2 = True + + -- In case a key is both in ary1 and ary2, check ary1[i] <= ary2[j] and + -- increment the indices i and j. + | b1Andb2 .&. m /= 0 = comp (A.index ary1 i) (A.index ary2 j) && + go (i+1) (j+1) (m `unsafeShiftL` 1) + + -- In case a key occurs in ary1, but not ary2, only increment index j. + | b2 .&. m /= 0 = go i (j+1) (m `unsafeShiftL` 1) + + -- In case a key neither occurs in ary1 nor ary2, continue. + | otherwise = go i j (m `unsafeShiftL` 1) + + b1Andb2 = b1 .&. b2 + b1Orb2 = b1 .|. b2 + subsetBitmaps = b1Orb2 == b2 +{-# INLINABLE submapBitmapIndexed #-} + +------------------------------------------------------------------------ +-- * Combine + +-- | \(O(n+m)\) The union of two maps. If a key occurs in both maps, the +-- mapping from the first will be the mapping in the result. +-- +-- ==== __Examples__ +-- +-- >>> union (fromList [(1,'a'),(2,'b')]) (fromList [(2,'c'),(3,'d')]) +-- fromList [(1,'a'),(2,'b'),(3,'d')] +union :: (Eq k, Hashable k) => HashMap k v -> HashMap k v -> HashMap k v +union = unionWith const +{-# INLINABLE union #-} + +-- | \(O(n+m)\) The union of two maps. If a key occurs in both maps, +-- the provided function (first argument) will be used to compute the +-- result. +unionWith :: (Eq k, Hashable k) => (v -> v -> v) -> HashMap k v -> HashMap k v + -> HashMap k v +unionWith f = unionWithKey (const f) +{-# INLINE unionWith #-} + +-- | \(O(n+m)\) The union of two maps. If a key occurs in both maps, +-- the provided function (first argument) will be used to compute the +-- result. +unionWithKey :: (Eq k, Hashable k) => (k -> v -> v -> v) -> HashMap k v -> HashMap k v + -> HashMap k v +unionWithKey f = go 0 + where + -- empty vs. anything + go !_ t1 Empty = t1 + go _ Empty t2 = t2 + -- leaf vs. leaf + go s t1@(Leaf h1 l1@(L k1 v1)) t2@(Leaf h2 l2@(L k2 v2)) + | h1 == h2 = if k1 == k2 + then Leaf h1 (L k1 (f k1 v1 v2)) + else collision h1 l1 l2 + | otherwise = goDifferentHash s h1 h2 t1 t2 + go s t1@(Leaf h1 (L k1 v1)) t2@(Collision h2 ls2) + | h1 == h2 = Collision h1 (updateOrSnocWithKey (\k a b -> (# f k a b #)) k1 v1 ls2) + | otherwise = goDifferentHash s h1 h2 t1 t2 + go s t1@(Collision h1 ls1) t2@(Leaf h2 (L k2 v2)) + | h1 == h2 = Collision h1 (updateOrSnocWithKey (\k a b -> (# f k b a #)) k2 v2 ls1) + | otherwise = goDifferentHash s h1 h2 t1 t2 + go s t1@(Collision h1 ls1) t2@(Collision h2 ls2) + | h1 == h2 = Collision h1 (updateOrConcatWithKey (\k a b -> (# f k a b #)) ls1 ls2) + | otherwise = goDifferentHash s h1 h2 t1 t2 + -- branch vs. branch + go s (BitmapIndexed b1 ary1) (BitmapIndexed b2 ary2) = + let b' = b1 .|. b2 + ary' = unionArrayBy (go (s+bitsPerSubkey)) b1 b2 ary1 ary2 + in bitmapIndexedOrFull b' ary' + go s (BitmapIndexed b1 ary1) (Full ary2) = + let ary' = unionArrayBy (go (s+bitsPerSubkey)) b1 fullNodeMask ary1 ary2 + in Full ary' + go s (Full ary1) (BitmapIndexed b2 ary2) = + let ary' = unionArrayBy (go (s+bitsPerSubkey)) fullNodeMask b2 ary1 ary2 + in Full ary' + go s (Full ary1) (Full ary2) = + let ary' = unionArrayBy (go (s+bitsPerSubkey)) fullNodeMask fullNodeMask + ary1 ary2 + in Full ary' + -- leaf vs. branch + go s (BitmapIndexed b1 ary1) t2 + | b1 .&. m2 == 0 = let ary' = A.insert ary1 i t2 + b' = b1 .|. m2 + in bitmapIndexedOrFull b' ary' + | otherwise = let ary' = A.updateWith' ary1 i $ \st1 -> + go (s+bitsPerSubkey) st1 t2 + in BitmapIndexed b1 ary' + where + h2 = leafHashCode t2 + m2 = mask h2 s + i = sparseIndex b1 m2 + go s t1 (BitmapIndexed b2 ary2) + | b2 .&. m1 == 0 = let ary' = A.insert ary2 i $! t1 + b' = b2 .|. m1 + in bitmapIndexedOrFull b' ary' + | otherwise = let ary' = A.updateWith' ary2 i $ \st2 -> + go (s+bitsPerSubkey) t1 st2 + in BitmapIndexed b2 ary' + where + h1 = leafHashCode t1 + m1 = mask h1 s + i = sparseIndex b2 m1 + go s (Full ary1) t2 = + let h2 = leafHashCode t2 + i = index h2 s + ary' = update32With' ary1 i $ \st1 -> go (s+bitsPerSubkey) st1 t2 + in Full ary' + go s t1 (Full ary2) = + let h1 = leafHashCode t1 + i = index h1 s + ary' = update32With' ary2 i $ \st2 -> go (s+bitsPerSubkey) t1 st2 + in Full ary' + + leafHashCode (Leaf h _) = h + leafHashCode (Collision h _) = h + leafHashCode _ = error "leafHashCode" + + goDifferentHash s h1 h2 t1 t2 + | m1 == m2 = BitmapIndexed m1 (A.singleton $! goDifferentHash (s+bitsPerSubkey) h1 h2 t1 t2) + | m1 < m2 = BitmapIndexed (m1 .|. m2) (A.pair t1 t2) + | otherwise = BitmapIndexed (m1 .|. m2) (A.pair t2 t1) + where + m1 = mask h1 s + m2 = mask h2 s +{-# INLINE unionWithKey #-} + +-- | Strict in the result of @f@. +unionArrayBy :: (a -> a -> a) -> Bitmap -> Bitmap -> A.Array a -> A.Array a + -> A.Array a +-- The manual forcing of @b1@, @b2@, @ary1@ and @ary2@ results in handsome +-- Core size reductions with GHC 9.2.2. See the Core diffs in +-- https://github.com/haskell-unordered-containers/unordered-containers/pull/376. +unionArrayBy f !b1 !b2 !ary1 !ary2 = A.run $ do + let bCombined = b1 .|. b2 + mary <- A.new_ (popCount bCombined) + -- iterate over nonzero bits of b1 .|. b2 + let go !i !i1 !i2 !b + | b == 0 = return () + | testBit (b1 .&. b2) = do + x1 <- A.indexM ary1 i1 + x2 <- A.indexM ary2 i2 + A.write mary i $! f x1 x2 + go (i+1) (i1+1) (i2+1) b' + | testBit b1 = do + A.write mary i =<< A.indexM ary1 i1 + go (i+1) (i1+1) i2 b' + | otherwise = do + A.write mary i =<< A.indexM ary2 i2 + go (i+1) i1 (i2+1) b' + where + m = 1 `unsafeShiftL` countTrailingZeros b + testBit x = x .&. m /= 0 + b' = b .&. complement m + go 0 0 0 bCombined + return mary + -- TODO: For the case where b1 .&. b2 == b1, i.e. when one is a + -- subset of the other, we could use a slightly simpler algorithm, + -- where we copy one array, and then update. +{-# INLINE unionArrayBy #-} + +-- TODO: Figure out the time complexity of 'unions'. + +-- | Construct a set containing all elements from a list of sets. +unions :: (Eq k, Hashable k) => [HashMap k v] -> HashMap k v +unions = List.foldl' union empty +{-# INLINE unions #-} + + +------------------------------------------------------------------------ +-- * Compose + +-- | Relate the keys of one map to the values of +-- the other, by using the values of the former as keys for lookups +-- in the latter. +-- +-- Complexity: \( O (n * \log(m)) \), where \(m\) is the size of the first argument +-- +-- >>> compose (fromList [('a', "A"), ('b', "B")]) (fromList [(1,'a'),(2,'b'),(3,'z')]) +-- fromList [(1,"A"),(2,"B")] +-- +-- @ +-- ('compose' bc ab '!?') = (bc '!?') <=< (ab '!?') +-- @ +-- +-- @since 0.2.13.0 +compose :: (Eq b, Hashable b) => HashMap b c -> HashMap a b -> HashMap a c +compose bc !ab + | null bc = empty + | otherwise = mapMaybe (bc !?) ab + +------------------------------------------------------------------------ +-- * Transformations + +-- | \(O(n)\) Transform this map by applying a function to every value. +mapWithKey :: (k -> v1 -> v2) -> HashMap k v1 -> HashMap k v2 +mapWithKey f = go + where + go Empty = Empty + go (Leaf h (L k v)) = Leaf h $ L k (f k v) + go (BitmapIndexed b ary) = BitmapIndexed b $ A.map go ary + go (Full ary) = Full $ A.map go ary + -- Why map strictly over collision arrays? Because there's no + -- point suspending the O(1) work this does for each leaf. + go (Collision h ary) = Collision h $ + A.map' (\ (L k v) -> L k (f k v)) ary +{-# INLINE mapWithKey #-} + +-- | \(O(n)\) Transform this map by applying a function to every value. +map :: (v1 -> v2) -> HashMap k v1 -> HashMap k v2 +map f = mapWithKey (const f) +{-# INLINE map #-} + +-- TODO: We should be able to use mutation to create the new +-- 'HashMap'. + +-- | \(O(n)\) Perform an 'Applicative' action for each key-value pair +-- in a 'HashMap' and produce a 'HashMap' of all the results. +-- +-- Note: the order in which the actions occur is unspecified. In particular, +-- when the map contains hash collisions, the order in which the actions +-- associated with the keys involved will depend in an unspecified way on +-- their insertion order. +traverseWithKey + :: Applicative f + => (k -> v1 -> f v2) + -> HashMap k v1 -> f (HashMap k v2) +traverseWithKey f = go + where + go Empty = pure Empty + go (Leaf h (L k v)) = Leaf h . L k <$> f k v + go (BitmapIndexed b ary) = BitmapIndexed b <$> A.traverse go ary + go (Full ary) = Full <$> A.traverse go ary + go (Collision h ary) = + Collision h <$> A.traverse' (\ (L k v) -> L k <$> f k v) ary +{-# INLINE traverseWithKey #-} + +-- | \(O(n)\). +-- @'mapKeys' f s@ is the map obtained by applying @f@ to each key of @s@. +-- +-- The size of the result may be smaller if @f@ maps two or more distinct +-- keys to the same new key. In this case there is no guarantee which of the +-- associated values is chosen for the conflicting key. +-- +-- >>> mapKeys (+ 1) (fromList [(5,"a"), (3,"b")]) +-- fromList [(4,"b"),(6,"a")] +-- >>> mapKeys (\ _ -> 1) (fromList [(1,"b"), (2,"a"), (3,"d"), (4,"c")]) +-- fromList [(1,"c")] +-- >>> mapKeys (\ _ -> 3) (fromList [(1,"b"), (2,"a"), (3,"d"), (4,"c")]) +-- fromList [(3,"c")] +-- +-- @since 0.2.14.0 +mapKeys :: (Eq k2, Hashable k2) => (k1 -> k2) -> HashMap k1 v -> HashMap k2 v +mapKeys f = fromList . foldrWithKey (\k x xs -> (f k, x) : xs) [] + +------------------------------------------------------------------------ +-- * Difference and intersection + +-- | \(O(n \log m)\) Difference of two maps. Return elements of the first map +-- not existing in the second. +difference :: (Eq k, Hashable k) => HashMap k v -> HashMap k w -> HashMap k v +difference a b = foldlWithKey' go empty a + where + go m k v = case lookup k b of + Nothing -> unsafeInsert k v m + _ -> m +{-# INLINABLE difference #-} + +-- | \(O(n \log m)\) Difference with a combining function. When two equal keys are +-- encountered, the combining function is applied to the values of these keys. +-- If it returns 'Nothing', the element is discarded (proper set difference). If +-- it returns (@'Just' y@), the element is updated with a new value @y@. +differenceWith :: (Eq k, Hashable k) => (v -> w -> Maybe v) -> HashMap k v -> HashMap k w -> HashMap k v +differenceWith f a b = foldlWithKey' go empty a + where + go m k v = case lookup k b of + Nothing -> unsafeInsert k v m + Just w -> maybe m (\y -> unsafeInsert k y m) (f v w) +{-# INLINABLE differenceWith #-} + +-- | \(O(n \log m)\) Intersection of two maps. Return elements of the first +-- map for keys existing in the second. +intersection :: (Eq k, Hashable k) => HashMap k v -> HashMap k w -> HashMap k v +intersection = Exts.inline intersectionWith const +{-# INLINABLE intersection #-} + +-- | \(O(n \log m)\) Intersection of two maps. If a key occurs in both maps +-- the provided function is used to combine the values from the two +-- maps. +intersectionWith :: (Eq k, Hashable k) => (v1 -> v2 -> v3) -> HashMap k v1 -> HashMap k v2 -> HashMap k v3 +intersectionWith f = Exts.inline intersectionWithKey $ const f +{-# INLINABLE intersectionWith #-} + +-- | \(O(n \log m)\) Intersection of two maps. If a key occurs in both maps +-- the provided function is used to combine the values from the two +-- maps. +intersectionWithKey :: (Eq k, Hashable k) => (k -> v1 -> v2 -> v3) -> HashMap k v1 -> HashMap k v2 -> HashMap k v3 +intersectionWithKey f = intersectionWithKey# $ \k v1 v2 -> (# f k v1 v2 #) +{-# INLINABLE intersectionWithKey #-} + +intersectionWithKey# :: Eq k => (k -> v1 -> v2 -> (# v3 #)) -> HashMap k v1 -> HashMap k v2 -> HashMap k v3 +intersectionWithKey# f = go 0 + where + -- empty vs. anything + go !_ _ Empty = Empty + go _ Empty _ = Empty + -- leaf vs. anything + go s (Leaf h1 (L k1 v1)) t2 = + lookupCont + (\_ -> Empty) + (\v _ -> case f k1 v1 v of (# v' #) -> Leaf h1 $ L k1 v') + h1 k1 s t2 + go s t1 (Leaf h2 (L k2 v2)) = + lookupCont + (\_ -> Empty) + (\v _ -> case f k2 v v2 of (# v' #) -> Leaf h2 $ L k2 v') + h2 k2 s t1 + -- collision vs. collision + go _ (Collision h1 ls1) (Collision h2 ls2) = intersectionCollisions f h1 h2 ls1 ls2 + -- branch vs. branch + go s (BitmapIndexed b1 ary1) (BitmapIndexed b2 ary2) = + intersectionArrayBy (go (s + bitsPerSubkey)) b1 b2 ary1 ary2 + go s (BitmapIndexed b1 ary1) (Full ary2) = + intersectionArrayBy (go (s + bitsPerSubkey)) b1 fullNodeMask ary1 ary2 + go s (Full ary1) (BitmapIndexed b2 ary2) = + intersectionArrayBy (go (s + bitsPerSubkey)) fullNodeMask b2 ary1 ary2 + go s (Full ary1) (Full ary2) = + intersectionArrayBy (go (s + bitsPerSubkey)) fullNodeMask fullNodeMask ary1 ary2 + -- collision vs. branch + go s (BitmapIndexed b1 ary1) t2@(Collision h2 _ls2) + | b1 .&. m2 == 0 = Empty + | otherwise = go (s + bitsPerSubkey) (A.index ary1 i) t2 + where + m2 = mask h2 s + i = sparseIndex b1 m2 + go s t1@(Collision h1 _ls1) (BitmapIndexed b2 ary2) + | b2 .&. m1 == 0 = Empty + | otherwise = go (s + bitsPerSubkey) t1 (A.index ary2 i) + where + m1 = mask h1 s + i = sparseIndex b2 m1 + go s (Full ary1) t2@(Collision h2 _ls2) = go (s + bitsPerSubkey) (A.index ary1 i) t2 + where + i = index h2 s + go s t1@(Collision h1 _ls1) (Full ary2) = go (s + bitsPerSubkey) t1 (A.index ary2 i) + where + i = index h1 s +{-# INLINE intersectionWithKey# #-} + +intersectionArrayBy :: + ( HashMap k v1 -> + HashMap k v2 -> + HashMap k v3 + ) -> + Bitmap -> + Bitmap -> + A.Array (HashMap k v1) -> + A.Array (HashMap k v2) -> + HashMap k v3 +intersectionArrayBy f !b1 !b2 !ary1 !ary2 + | b1 .&. b2 == 0 = Empty + | otherwise = runST $ do + mary <- A.new_ $ popCount bIntersect + -- iterate over nonzero bits of b1 .|. b2 + let go !i !i1 !i2 !b !bFinal + | b == 0 = pure (i, bFinal) + | testBit $ b1 .&. b2 = do + x1 <- A.indexM ary1 i1 + x2 <- A.indexM ary2 i2 + case f x1 x2 of + Empty -> go i (i1 + 1) (i2 + 1) b' (bFinal .&. complement m) + _ -> do + A.write mary i $! f x1 x2 + go (i + 1) (i1 + 1) (i2 + 1) b' bFinal + | testBit b1 = go i (i1 + 1) i2 b' bFinal + | otherwise = go i i1 (i2 + 1) b' bFinal + where + m = 1 `unsafeShiftL` countTrailingZeros b + testBit x = x .&. m /= 0 + b' = b .&. complement m + (len, bFinal) <- go 0 0 0 bCombined bIntersect + case len of + 0 -> pure Empty + 1 -> do + l <- A.read mary 0 + if isLeafOrCollision l + then pure l + else BitmapIndexed bFinal <$> (A.unsafeFreeze =<< A.shrink mary 1) + _ -> bitmapIndexedOrFull bFinal <$> (A.unsafeFreeze =<< A.shrink mary len) + where + bCombined = b1 .|. b2 + bIntersect = b1 .&. b2 +{-# INLINE intersectionArrayBy #-} + +intersectionCollisions :: Eq k => (k -> v1 -> v2 -> (# v3 #)) -> Hash -> Hash -> A.Array (Leaf k v1) -> A.Array (Leaf k v2) -> HashMap k v3 +intersectionCollisions f h1 h2 ary1 ary2 + | h1 == h2 = runST $ do + mary2 <- A.thaw ary2 0 $ A.length ary2 + mary <- A.new_ $ min (A.length ary1) (A.length ary2) + let go i j + | i >= A.length ary1 || j >= A.lengthM mary2 = pure j + | otherwise = do + L k1 v1 <- A.indexM ary1 i + searchSwap k1 j mary2 >>= \case + Just (L _k2 v2) -> do + let !(# v3 #) = f k1 v1 v2 + A.write mary j $ L k1 v3 + go (i + 1) (j + 1) + Nothing -> do + go (i + 1) j + len <- go 0 0 + case len of + 0 -> pure Empty + 1 -> Leaf h1 <$> A.read mary 0 + _ -> Collision h1 <$> (A.unsafeFreeze =<< A.shrink mary len) + | otherwise = Empty +{-# INLINE intersectionCollisions #-} + +-- | Say we have +-- @ +-- 1 2 3 4 +-- @ +-- and we search for @3@. Then we can mutate the array to +-- @ +-- undefined 2 1 4 +-- @ +-- We don't actually need to write undefined, we just have to make sure that the next search starts 1 after the current one. +searchSwap :: Eq k => k -> Int -> A.MArray s (Leaf k v) -> ST s (Maybe (Leaf k v)) +searchSwap toFind start = go start toFind start + where + go i0 k i mary + | i >= A.lengthM mary = pure Nothing + | otherwise = do + l@(L k' _v) <- A.read mary i + if k == k' + then do + A.write mary i =<< A.read mary i0 + pure $ Just l + else go i0 k (i + 1) mary +{-# INLINE searchSwap #-} + +------------------------------------------------------------------------ +-- * Folds + +-- | \(O(n)\) Reduce this map by applying a binary operator to all +-- elements, using the given starting value (typically the +-- left-identity of the operator). Each application of the operator +-- is evaluated before using the result in the next application. +-- This function is strict in the starting value. +foldl' :: (a -> v -> a) -> a -> HashMap k v -> a +foldl' f = foldlWithKey' (\ z _ v -> f z v) +{-# INLINE foldl' #-} + +-- | \(O(n)\) Reduce this map by applying a binary operator to all +-- elements, using the given starting value (typically the +-- right-identity of the operator). Each application of the operator +-- is evaluated before using the result in the next application. +-- This function is strict in the starting value. +foldr' :: (v -> a -> a) -> a -> HashMap k v -> a +foldr' f = foldrWithKey' (\ _ v z -> f v z) +{-# INLINE foldr' #-} + +-- | \(O(n)\) Reduce this map by applying a binary operator to all +-- elements, using the given starting value (typically the +-- left-identity of the operator). Each application of the operator +-- is evaluated before using the result in the next application. +-- This function is strict in the starting value. +foldlWithKey' :: (a -> k -> v -> a) -> a -> HashMap k v -> a +foldlWithKey' f = go + where + go !z Empty = z + go z (Leaf _ (L k v)) = f z k v + go z (BitmapIndexed _ ary) = A.foldl' go z ary + go z (Full ary) = A.foldl' go z ary + go z (Collision _ ary) = A.foldl' (\ z' (L k v) -> f z' k v) z ary +{-# INLINE foldlWithKey' #-} + +-- | \(O(n)\) Reduce this map by applying a binary operator to all +-- elements, using the given starting value (typically the +-- right-identity of the operator). Each application of the operator +-- is evaluated before using the result in the next application. +-- This function is strict in the starting value. +foldrWithKey' :: (k -> v -> a -> a) -> a -> HashMap k v -> a +foldrWithKey' f = flip go + where + go Empty z = z + go (Leaf _ (L k v)) !z = f k v z + go (BitmapIndexed _ ary) !z = A.foldr' go z ary + go (Full ary) !z = A.foldr' go z ary + go (Collision _ ary) !z = A.foldr' (\ (L k v) z' -> f k v z') z ary +{-# INLINE foldrWithKey' #-} + +-- | \(O(n)\) Reduce this map by applying a binary operator to all +-- elements, using the given starting value (typically the +-- right-identity of the operator). +foldr :: (v -> a -> a) -> a -> HashMap k v -> a +foldr f = foldrWithKey (const f) +{-# INLINE foldr #-} + +-- | \(O(n)\) Reduce this map by applying a binary operator to all +-- elements, using the given starting value (typically the +-- left-identity of the operator). +foldl :: (a -> v -> a) -> a -> HashMap k v -> a +foldl f = foldlWithKey (\a _k v -> f a v) +{-# INLINE foldl #-} + +-- | \(O(n)\) Reduce this map by applying a binary operator to all +-- elements, using the given starting value (typically the +-- right-identity of the operator). +foldrWithKey :: (k -> v -> a -> a) -> a -> HashMap k v -> a +foldrWithKey f = flip go + where + go Empty z = z + go (Leaf _ (L k v)) z = f k v z + go (BitmapIndexed _ ary) z = A.foldr go z ary + go (Full ary) z = A.foldr go z ary + go (Collision _ ary) z = A.foldr (\ (L k v) z' -> f k v z') z ary +{-# INLINE foldrWithKey #-} + +-- | \(O(n)\) Reduce this map by applying a binary operator to all +-- elements, using the given starting value (typically the +-- left-identity of the operator). +foldlWithKey :: (a -> k -> v -> a) -> a -> HashMap k v -> a +foldlWithKey f = go + where + go z Empty = z + go z (Leaf _ (L k v)) = f z k v + go z (BitmapIndexed _ ary) = A.foldl go z ary + go z (Full ary) = A.foldl go z ary + go z (Collision _ ary) = A.foldl (\ z' (L k v) -> f z' k v) z ary +{-# INLINE foldlWithKey #-} + +-- | \(O(n)\) Reduce the map by applying a function to each element +-- and combining the results with a monoid operation. +foldMapWithKey :: Monoid m => (k -> v -> m) -> HashMap k v -> m +foldMapWithKey f = go + where + go Empty = mempty + go (Leaf _ (L k v)) = f k v + go (BitmapIndexed _ ary) = A.foldMap go ary + go (Full ary) = A.foldMap go ary + go (Collision _ ary) = A.foldMap (\ (L k v) -> f k v) ary +{-# INLINE foldMapWithKey #-} + +------------------------------------------------------------------------ +-- * Filter + +-- | \(O(n)\) Transform this map by applying a function to every value +-- and retaining only some of them. +mapMaybeWithKey :: (k -> v1 -> Maybe v2) -> HashMap k v1 -> HashMap k v2 +mapMaybeWithKey f = filterMapAux onLeaf onColl + where onLeaf (Leaf h (L k v)) | Just v' <- f k v = Just (Leaf h (L k v')) + onLeaf _ = Nothing + + onColl (L k v) | Just v' <- f k v = Just (L k v') + | otherwise = Nothing +{-# INLINE mapMaybeWithKey #-} + +-- | \(O(n)\) Transform this map by applying a function to every value +-- and retaining only some of them. +mapMaybe :: (v1 -> Maybe v2) -> HashMap k v1 -> HashMap k v2 +mapMaybe f = mapMaybeWithKey (const f) +{-# INLINE mapMaybe #-} + +-- | \(O(n)\) Filter this map by retaining only elements satisfying a +-- predicate. +filterWithKey :: forall k v. (k -> v -> Bool) -> HashMap k v -> HashMap k v +filterWithKey pred = filterMapAux onLeaf onColl + where onLeaf t@(Leaf _ (L k v)) | pred k v = Just t + onLeaf _ = Nothing + + onColl el@(L k v) | pred k v = Just el + onColl _ = Nothing +{-# INLINE filterWithKey #-} + + +-- | Common implementation for 'filterWithKey' and 'mapMaybeWithKey', +-- allowing the former to former to reuse terms. +filterMapAux :: forall k v1 v2 + . (HashMap k v1 -> Maybe (HashMap k v2)) + -> (Leaf k v1 -> Maybe (Leaf k v2)) + -> HashMap k v1 + -> HashMap k v2 +filterMapAux onLeaf onColl = go + where + go Empty = Empty + go t@Leaf{} + | Just t' <- onLeaf t = t' + | otherwise = Empty + go (BitmapIndexed b ary) = filterA ary b + go (Full ary) = filterA ary fullNodeMask + go (Collision h ary) = filterC ary h + + filterA ary0 b0 = + let !n = A.length ary0 + in runST $ do + mary <- A.new_ n + step ary0 mary b0 0 0 1 n + where + step :: A.Array (HashMap k v1) -> A.MArray s (HashMap k v2) + -> Bitmap -> Int -> Int -> Bitmap -> Int + -> ST s (HashMap k v2) + step !ary !mary !b i !j !bi n + | i >= n = case j of + 0 -> return Empty + 1 -> do + ch <- A.read mary 0 + case ch of + t | isLeafOrCollision t -> return t + _ -> BitmapIndexed b <$> A.trim mary 1 + _ -> do + ary2 <- A.trim mary j + return $! if j == maxChildren + then Full ary2 + else BitmapIndexed b ary2 + | bi .&. b == 0 = step ary mary b i j (bi `unsafeShiftL` 1) n + | otherwise = case go (A.index ary i) of + Empty -> step ary mary (b .&. complement bi) (i+1) j + (bi `unsafeShiftL` 1) n + t -> do A.write mary j t + step ary mary b (i+1) (j+1) (bi `unsafeShiftL` 1) n + + filterC ary0 h = + let !n = A.length ary0 + in runST $ do + mary <- A.new_ n + step ary0 mary 0 0 n + where + step :: A.Array (Leaf k v1) -> A.MArray s (Leaf k v2) + -> Int -> Int -> Int + -> ST s (HashMap k v2) + step !ary !mary i !j n + | i >= n = case j of + 0 -> return Empty + 1 -> do l <- A.read mary 0 + return $! Leaf h l + _ | i == j -> do ary2 <- A.unsafeFreeze mary + return $! Collision h ary2 + | otherwise -> do ary2 <- A.trim mary j + return $! Collision h ary2 + | Just el <- onColl $! A.index ary i + = A.write mary j el >> step ary mary (i+1) (j+1) n + | otherwise = step ary mary (i+1) j n +{-# INLINE filterMapAux #-} + +-- | \(O(n)\) Filter this map by retaining only elements which values +-- satisfy a predicate. +filter :: (v -> Bool) -> HashMap k v -> HashMap k v +filter p = filterWithKey (\_ v -> p v) +{-# INLINE filter #-} + +------------------------------------------------------------------------ +-- * Conversions + +-- TODO: Improve fusion rules by modelled them after the Prelude ones +-- on lists. + +-- | \(O(n)\) Return a list of this map's keys. The list is produced +-- lazily. +keys :: HashMap k v -> [k] +keys = List.map fst . toList +{-# INLINE keys #-} + +-- | \(O(n)\) Return a list of this map's values. The list is produced +-- lazily. +elems :: HashMap k v -> [v] +elems = List.map snd . toList +{-# INLINE elems #-} + +------------------------------------------------------------------------ +-- ** Lists + +-- | \(O(n)\) Return a list of this map's elements. The list is +-- produced lazily. The order of its elements is unspecified. +toList :: HashMap k v -> [(k, v)] +toList t = Exts.build (\ c z -> foldrWithKey (curry c) z t) +{-# INLINE toList #-} + +-- | \(O(n)\) Construct a map with the supplied mappings. If the list +-- contains duplicate mappings, the later mappings take precedence. +fromList :: (Eq k, Hashable k) => [(k, v)] -> HashMap k v +fromList = List.foldl' (\ m (k, v) -> unsafeInsert k v m) empty +{-# INLINABLE fromList #-} + +-- | \(O(n \log n)\) Construct a map from a list of elements. Uses +-- the provided function @f@ to merge duplicate entries with +-- @(f newVal oldVal)@. +-- +-- === Examples +-- +-- Given a list @xs@, create a map with the number of occurrences of each +-- element in @xs@: +-- +-- > let xs = ['a', 'b', 'a'] +-- > in fromListWith (+) [ (x, 1) | x <- xs ] +-- > +-- > = fromList [('a', 2), ('b', 1)] +-- +-- Given a list of key-value pairs @xs :: [(k, v)]@, group all values by their +-- keys and return a @HashMap k [v]@. +-- +-- > let xs = [('a', 1), ('b', 2), ('a', 3)] +-- > in fromListWith (++) [ (k, [v]) | (k, v) <- xs ] +-- > +-- > = fromList [('a', [3, 1]), ('b', [2])] +-- +-- Note that the lists in the resulting map contain elements in reverse order +-- from their occurences in the original list. +-- +-- More generally, duplicate entries are accumulated as follows; +-- this matters when @f@ is not commutative or not associative. +-- +-- > fromListWith f [(k, a), (k, b), (k, c), (k, d)] +-- > = fromList [(k, f d (f c (f b a)))] +fromListWith :: (Eq k, Hashable k) => (v -> v -> v) -> [(k, v)] -> HashMap k v +fromListWith f = List.foldl' (\ m (k, v) -> unsafeInsertWith f k v m) empty +{-# INLINE fromListWith #-} + +-- | \(O(n \log n)\) Construct a map from a list of elements. Uses +-- the provided function to merge duplicate entries. +-- +-- === Examples +-- +-- Given a list of key-value pairs where the keys are of different flavours, e.g: +-- +-- > data Key = Div | Sub +-- +-- and the values need to be combined differently when there are duplicates, +-- depending on the key: +-- +-- > combine Div = div +-- > combine Sub = (-) +-- +-- then @fromListWithKey@ can be used as follows: +-- +-- > fromListWithKey combine [(Div, 2), (Div, 6), (Sub, 2), (Sub, 3)] +-- > = fromList [(Div, 3), (Sub, 1)] +-- +-- More generally, duplicate entries are accumulated as follows; +-- +-- > fromListWith f [(k, a), (k, b), (k, c), (k, d)] +-- > = fromList [(k, f k d (f k c (f k b a)))] +-- +-- @since 0.2.11 +fromListWithKey :: (Eq k, Hashable k) => (k -> v -> v -> v) -> [(k, v)] -> HashMap k v +fromListWithKey f = List.foldl' (\ m (k, v) -> unsafeInsertWithKey (\k' a b -> (# f k' a b #)) k v m) empty +{-# INLINE fromListWithKey #-} + +------------------------------------------------------------------------ +-- Array operations + +-- | \(O(n)\) Look up the value associated with the given key in an +-- array. +lookupInArrayCont :: + forall rep (r :: TYPE rep) k v. + Eq k => ((# #) -> r) -> (v -> Int -> r) -> k -> A.Array (Leaf k v) -> r +lookupInArrayCont absent present k0 ary0 = go k0 ary0 0 (A.length ary0) + where + go :: Eq k => k -> A.Array (Leaf k v) -> Int -> Int -> r + go !k !ary !i !n + | i >= n = absent (# #) + | otherwise = case A.index ary i of + (L kx v) + | k == kx -> present v i + | otherwise -> go k ary (i+1) n +{-# INLINE lookupInArrayCont #-} + +-- | \(O(n)\) Lookup the value associated with the given key in this +-- array. Returns 'Nothing' if the key wasn't found. +indexOf :: Eq k => k -> A.Array (Leaf k v) -> Maybe Int +indexOf k0 ary0 = go k0 ary0 0 (A.length ary0) + where + go !k !ary !i !n + | i >= n = Nothing + | otherwise = case A.index ary i of + (L kx _) + | k == kx -> Just i + | otherwise -> go k ary (i+1) n +{-# INLINABLE indexOf #-} + +updateWith# :: Eq k => (v -> (# v #)) -> k -> A.Array (Leaf k v) -> A.Array (Leaf k v) +updateWith# f k0 ary0 = go k0 ary0 0 (A.length ary0) + where + go !k !ary !i !n + | i >= n = ary + | otherwise = case A.index ary i of + (L kx y) | k == kx -> case f y of + (# y' #) + | ptrEq y y' -> ary + | otherwise -> A.update ary i (L k y') + | otherwise -> go k ary (i+1) n +{-# INLINABLE updateWith# #-} + +updateOrSnocWith :: Eq k => (v -> v -> (# v #)) -> k -> v -> A.Array (Leaf k v) + -> A.Array (Leaf k v) +updateOrSnocWith f = updateOrSnocWithKey (const f) +{-# INLINABLE updateOrSnocWith #-} + +updateOrSnocWithKey :: Eq k => (k -> v -> v -> (# v #)) -> k -> v -> A.Array (Leaf k v) + -> A.Array (Leaf k v) +updateOrSnocWithKey f k0 v0 ary0 = go k0 v0 ary0 0 (A.length ary0) + where + go !k v !ary !i !n + -- Not found, append to the end. + | i >= n = A.snoc ary $ L k v + | L kx y <- A.index ary i + , k == kx + , (# v2 #) <- f k v y + = A.update ary i (L k v2) + | otherwise + = go k v ary (i+1) n +{-# INLINABLE updateOrSnocWithKey #-} + +updateOrConcatWithKey :: Eq k => (k -> v -> v -> (# v #)) -> A.Array (Leaf k v) -> A.Array (Leaf k v) -> A.Array (Leaf k v) +updateOrConcatWithKey f ary1 ary2 = A.run $ do + -- TODO: instead of mapping and then folding, should we traverse? + -- We'll have to be careful to avoid allocating pairs or similar. + + -- first: look up the position of each element of ary2 in ary1 + let indices = A.map' (\(L k _) -> indexOf k ary1) ary2 + -- that tells us how large the overlap is: + -- count number of Nothing constructors + let nOnly2 = A.foldl' (\n -> maybe (n+1) (const n)) 0 indices + let n1 = A.length ary1 + let n2 = A.length ary2 + -- copy over all elements from ary1 + mary <- A.new_ (n1 + nOnly2) + A.copy ary1 0 mary 0 n1 + -- append or update all elements from ary2 + let go !iEnd !i2 + | i2 >= n2 = return () + | otherwise = case A.index indices i2 of + Just i1 -> do -- key occurs in both arrays, store combination in position i1 + L k v1 <- A.indexM ary1 i1 + L _ v2 <- A.indexM ary2 i2 + case f k v1 v2 of (# v3 #) -> A.write mary i1 (L k v3) + go iEnd (i2+1) + Nothing -> do -- key is only in ary2, append to end + A.write mary iEnd =<< A.indexM ary2 i2 + go (iEnd+1) (i2+1) + go n1 0 + return mary +{-# INLINABLE updateOrConcatWithKey #-} + +-- | \(O(n*m)\) Check if the first array is a subset of the second array. +subsetArray :: Eq k => (v1 -> v2 -> Bool) -> A.Array (Leaf k v1) -> A.Array (Leaf k v2) -> Bool +subsetArray cmpV ary1 ary2 = A.length ary1 <= A.length ary2 && A.all inAry2 ary1 + where + inAry2 (L k1 v1) = lookupInArrayCont (\_ -> False) (\v2 _ -> cmpV v1 v2) k1 ary2 + {-# INLINE inAry2 #-} + +------------------------------------------------------------------------ +-- Manually unrolled loops + +-- | \(O(n)\) Update the element at the given position in this array. +update32 :: A.Array e -> Int -> e -> A.Array e +update32 ary idx b = runST (update32M ary idx b) +{-# INLINE update32 #-} + +-- | \(O(n)\) Update the element at the given position in this array. +update32M :: A.Array e -> Int -> e -> ST s (A.Array e) +update32M ary idx b = do + mary <- clone ary + A.write mary idx b + A.unsafeFreeze mary +{-# INLINE update32M #-} + +-- | \(O(n)\) Update the element at the given position in this array, by applying a function to it. +update32With' :: A.Array e -> Int -> (e -> e) -> A.Array e +update32With' ary idx f + | (# x #) <- A.index# ary idx + = update32 ary idx $! f x +{-# INLINE update32With' #-} + +-- | Unsafely clone an array of (2^bitsPerSubkey) elements. The length of the input +-- array is not checked. +clone :: A.Array e -> ST s (A.MArray s e) +clone ary = + A.thaw ary 0 (2^bitsPerSubkey) + +------------------------------------------------------------------------ +-- Bit twiddling + +-- TODO: Name this 'bitsPerLevel'?! What is a "subkey"? +-- https://github.com/haskell-unordered-containers/unordered-containers/issues/425 + +-- | Number of bits that are inspected at each level of the hash tree. +-- +-- This constant is named /t/ in the original /Ideal Hash Trees/ paper. +bitsPerSubkey :: Int +bitsPerSubkey = 5 + +-- | The size of a 'Full' node, i.e. @2 ^ 'bitsPerSubkey'@. +maxChildren :: Int +maxChildren = 1 `unsafeShiftL` bitsPerSubkey + +-- | Bit mask with the lowest 'bitsPerSubkey' bits set, i.e. @0b11111@. +subkeyMask :: Word +subkeyMask = 1 `unsafeShiftL` bitsPerSubkey - 1 + +-- | Given a 'Hash' and a 'Shift' that indicates the level in the tree, compute +-- the index into a 'Full' node or into the bitmap of a `BitmapIndexed` node. +-- +-- >>> index 0b0010_0010 0 +-- 0b0000_0010 +index :: Hash -> Shift -> Int +index w s = fromIntegral $ unsafeShiftR w s .&. subkeyMask +{-# INLINE index #-} + +-- | Given a 'Hash' and a 'Shift' that indicates the level in the tree, compute +-- the bitmap that contains only the 'index' of the hash at this level. +-- +-- The result can be used for constructing one-element 'BitmapIndexed' nodes or +-- to check whether a 'BitmapIndexed' node may possibly contain the given 'Hash'. +-- +-- >>> mask 0b0010_0010 0 +-- 0b0100 +mask :: Hash -> Shift -> Bitmap +mask w s = 1 `unsafeShiftL` index w s +{-# INLINE mask #-} + +-- | This array index is computed by counting the number of bits below the +-- 'index' represented by the mask. +-- +-- >>> sparseIndex 0b0110_0110 0b0010_0000 +-- 2 +sparseIndex + :: Bitmap + -- ^ Bitmap of a 'BitmapIndexed' node + -> Bitmap + -- ^ One-bit 'mask' corresponding to the 'index' of a hash + -> Int + -- ^ Index into the array of the 'BitmapIndexed' node +sparseIndex b m = popCount (b .&. (m - 1)) +{-# INLINE sparseIndex #-} + +-- TODO: Should be named _(bit)map_ instead of _mask_ + +-- | A bitmap with the 'maxChildren' least significant bits set, i.e. +-- @0xFF_FF_FF_FF@. +fullNodeMask :: Bitmap +-- This needs to use 'shiftL' instead of 'unsafeShiftL', to avoid UB. +-- See issue #412. +fullNodeMask = complement (complement 0 `shiftL` maxChildren) +{-# INLINE fullNodeMask #-} + +------------------------------------------------------------------------ +-- Pointer equality + +-- | Check if two the two arguments are the same value. N.B. This +-- function might give false negatives (due to GC moving objects.) +ptrEq :: a -> a -> Bool +ptrEq x y = Exts.isTrue# (Exts.reallyUnsafePtrEquality# x y ==# 1#) +{-# INLINE ptrEq #-} + +------------------------------------------------------------------------ +-- IsList instance +instance (Eq k, Hashable k) => Exts.IsList (HashMap k v) where + type Item (HashMap k v) = (k, v) + fromList = fromList + toList = toList diff --git a/Data/HashMap/Internal/Array.hs b/Data/HashMap/Internal/Array.hs new file mode 100644 index 0000000..85c0d7a --- /dev/null +++ b/Data/HashMap/Internal/Array.hs @@ -0,0 +1,614 @@ +{-# LANGUAGE BangPatterns #-} +{-# LANGUAGE CPP #-} +{-# LANGUAGE MagicHash #-} +{-# LANGUAGE Rank2Types #-} +{-# LANGUAGE ScopedTypeVariables #-} +{-# LANGUAGE TemplateHaskellQuotes #-} +{-# LANGUAGE UnboxedTuples #-} +{-# OPTIONS_GHC -fno-full-laziness -funbox-strict-fields #-} +{-# OPTIONS_HADDOCK not-home #-} + +-- | = WARNING +-- +-- This module is considered __internal__. +-- +-- The Package Versioning Policy __does not apply__. +-- +-- The contents of this module may change __in any way whatsoever__ +-- and __without any warning__ between minor versions of this package. +-- +-- Authors importing this module are expected to track development +-- closely. +-- +-- = Description +-- +-- Zero based arrays. +-- +-- Note that no bounds checking are performed. +module Data.HashMap.Internal.Array + ( Array(..) + , MArray(..) + + -- * Creation + , new + , new_ + , singleton + , singletonM + , snoc + , pair + + -- * Basic interface + , length + , lengthM + , read + , write + , index + , indexM + , index# + , update + , updateWith' + , unsafeUpdateM + , insert + , insertM + , delete + , sameArray1 + , trim + + , unsafeFreeze + , unsafeThaw + , unsafeSameArray + , run + , copy + , copyM + + -- * Folds + , foldl + , foldl' + , foldr + , foldr' + , foldMap + , all + + , thaw + , map + , map' + , traverse + , traverse' + , toList + , fromList + , fromList' + , shrink + ) where + +import Control.Applicative (liftA2) +import Control.DeepSeq (NFData (..), NFData1 (..)) +import Control.Monad ((>=>)) +import Control.Monad.ST (runST, stToIO) +import GHC.Exts (Int (..), SmallArray#, SmallMutableArray#, + cloneSmallMutableArray#, copySmallArray#, + copySmallMutableArray#, indexSmallArray#, + newSmallArray#, readSmallArray#, + reallyUnsafePtrEquality#, sizeofSmallArray#, + sizeofSmallMutableArray#, tagToEnum#, + thawSmallArray#, unsafeCoerce#, + unsafeFreezeSmallArray#, unsafeThawSmallArray#, + writeSmallArray#) +import GHC.ST (ST (..)) +import Prelude hiding (all, filter, foldMap, foldl, foldr, length, + map, read, traverse) + +import qualified GHC.Exts as Exts +import qualified Language.Haskell.TH.Syntax as TH +#if defined(ASSERTS) +import qualified Prelude +#endif + + +#if defined(ASSERTS) +-- This fugly hack is brought by GHC's apparent reluctance to deal +-- with MagicHash and UnboxedTuples when inferring types. Eek! +# define CHECK_BOUNDS(_func_,_len_,_k_) \ +if (_k_) < 0 || (_k_) >= (_len_) then error ("Data.HashMap.Internal.Array." ++ (_func_) ++ ": bounds error, offset " ++ show (_k_) ++ ", length " ++ show (_len_)) else +# define CHECK_OP(_func_,_op_,_lhs_,_rhs_) \ +if not ((_lhs_) _op_ (_rhs_)) then error ("Data.HashMap.Internal.Array." ++ (_func_) ++ ": Check failed: _lhs_ _op_ _rhs_ (" ++ show (_lhs_) ++ " vs. " ++ show (_rhs_) ++ ")") else +# define CHECK_GT(_func_,_lhs_,_rhs_) CHECK_OP(_func_,>,_lhs_,_rhs_) +# define CHECK_LE(_func_,_lhs_,_rhs_) CHECK_OP(_func_,<=,_lhs_,_rhs_) +# define CHECK_EQ(_func_,_lhs_,_rhs_) CHECK_OP(_func_,==,_lhs_,_rhs_) +#else +# define CHECK_BOUNDS(_func_,_len_,_k_) +# define CHECK_OP(_func_,_op_,_lhs_,_rhs_) +# define CHECK_GT(_func_,_lhs_,_rhs_) +# define CHECK_LE(_func_,_lhs_,_rhs_) +# define CHECK_EQ(_func_,_lhs_,_rhs_) +#endif + +data Array a = Array { + unArray :: !(SmallArray# a) + } + +instance Show a => Show (Array a) where + show = show . toList + +-- Determines whether two arrays have the same memory address. +-- This is more reliable than testing pointer equality on the +-- Array wrappers, but it's still slightly bogus. +unsafeSameArray :: Array a -> Array b -> Bool +unsafeSameArray (Array xs) (Array ys) = + tagToEnum# (unsafeCoerce# reallyUnsafePtrEquality# xs ys) + +sameArray1 :: (a -> b -> Bool) -> Array a -> Array b -> Bool +sameArray1 eq !xs0 !ys0 + | lenxs /= lenys = False + | otherwise = go 0 xs0 ys0 + where + go !k !xs !ys + | k == lenxs = True + | (# x #) <- index# xs k + , (# y #) <- index# ys k + = eq x y && go (k + 1) xs ys + + !lenxs = length xs0 + !lenys = length ys0 + +length :: Array a -> Int +length ary = I# (sizeofSmallArray# (unArray ary)) +{-# INLINE length #-} + +data MArray s a = MArray { + unMArray :: !(SmallMutableArray# s a) + } + +lengthM :: MArray s a -> Int +lengthM mary = I# (sizeofSmallMutableArray# (unMArray mary)) +{-# INLINE lengthM #-} + +------------------------------------------------------------------------ + +instance NFData a => NFData (Array a) where + rnf = rnfArray + +rnfArray :: NFData a => Array a -> () +rnfArray ary0 = go ary0 n0 0 + where + n0 = length ary0 + go !ary !n !i + | i >= n = () + | (# x #) <- index# ary i + = rnf x `seq` go ary n (i+1) +-- We use index# just in case GHC can't see that the +-- relevant rnf is strict, or in case it actually isn't. +{-# INLINE rnfArray #-} + +-- | @since 0.2.14.0 +instance NFData1 Array where + liftRnf = liftRnfArray + +liftRnfArray :: (a -> ()) -> Array a -> () +liftRnfArray rnf0 ary0 = go ary0 n0 0 + where + n0 = length ary0 + go !ary !n !i + | i >= n = () + | (# x #) <- index# ary i + = rnf0 x `seq` go ary n (i+1) +{-# INLINE liftRnfArray #-} + +-- | Create a new mutable array of specified size, in the specified +-- state thread, with each element containing the specified initial +-- value. +new :: Int -> a -> ST s (MArray s a) +new _n@(I# n#) b = + CHECK_GT("new",_n,(0 :: Int)) + ST $ \s -> + case newSmallArray# n# b s of + (# s', ary #) -> (# s', MArray ary #) +{-# INLINE new #-} + +new_ :: Int -> ST s (MArray s a) +new_ n = new n undefinedElem + +-- | When 'Exts.shrinkSmallMutableArray#' is available, the returned array is the same as the array given, as it is shrunk in place. +-- Otherwise a copy is made. +shrink :: MArray s a -> Int -> ST s (MArray s a) +#if __GLASGOW_HASKELL__ >= 810 +shrink mary _n@(I# n#) = + CHECK_GT("shrink", _n, (0 :: Int)) + CHECK_LE("shrink", _n, (lengthM mary)) + ST $ \s -> case Exts.shrinkSmallMutableArray# (unMArray mary) n# s of + s' -> (# s', mary #) +#else +shrink mary n = cloneM mary 0 n +#endif +{-# INLINE shrink #-} + +singleton :: a -> Array a +singleton x = runST (singletonM x) +{-# INLINE singleton #-} + +singletonM :: a -> ST s (Array a) +singletonM x = new 1 x >>= unsafeFreeze +{-# INLINE singletonM #-} + +snoc :: Array a -> a -> Array a +snoc ary x = run $ do + mary <- new (n + 1) x + copy ary 0 mary 0 n + pure mary + where + n = length ary +{-# INLINE snoc #-} + +pair :: a -> a -> Array a +pair x y = run $ do + ary <- new 2 x + write ary 1 y + return ary +{-# INLINE pair #-} + +read :: MArray s a -> Int -> ST s a +read ary _i@(I# i#) = ST $ \ s -> + CHECK_BOUNDS("read", lengthM ary, _i) + readSmallArray# (unMArray ary) i# s +{-# INLINE read #-} + +write :: MArray s a -> Int -> a -> ST s () +write ary _i@(I# i#) b = ST $ \ s -> + CHECK_BOUNDS("write", lengthM ary, _i) + case writeSmallArray# (unMArray ary) i# b s of + s' -> (# s' , () #) +{-# INLINE write #-} + +index :: Array a -> Int -> a +index ary _i@(I# i#) = + CHECK_BOUNDS("index", length ary, _i) + case indexSmallArray# (unArray ary) i# of (# b #) -> b +{-# INLINE index #-} + +index# :: Array a -> Int -> (# a #) +index# ary _i@(I# i#) = + CHECK_BOUNDS("index#", length ary, _i) + indexSmallArray# (unArray ary) i# +{-# INLINE index# #-} + +indexM :: Array a -> Int -> ST s a +indexM ary _i@(I# i#) = + CHECK_BOUNDS("indexM", length ary, _i) + case indexSmallArray# (unArray ary) i# of (# b #) -> return b +{-# INLINE indexM #-} + +unsafeFreeze :: MArray s a -> ST s (Array a) +unsafeFreeze mary + = ST $ \s -> case unsafeFreezeSmallArray# (unMArray mary) s of + (# s', ary #) -> (# s', Array ary #) +{-# INLINE unsafeFreeze #-} + +unsafeThaw :: Array a -> ST s (MArray s a) +unsafeThaw ary + = ST $ \s -> case unsafeThawSmallArray# (unArray ary) s of + (# s', mary #) -> (# s', MArray mary #) +{-# INLINE unsafeThaw #-} + +run :: (forall s . ST s (MArray s e)) -> Array e +run act = runST $ act >>= unsafeFreeze +{-# INLINE run #-} + +-- | Unsafely copy the elements of an array. Array bounds are not checked. +copy :: Array e -> Int -> MArray s e -> Int -> Int -> ST s () +copy !src !_sidx@(I# sidx#) !dst !_didx@(I# didx#) _n@(I# n#) = + CHECK_LE("copy", _sidx + _n, length src) + CHECK_LE("copy", _didx + _n, lengthM dst) + ST $ \ s# -> + case copySmallArray# (unArray src) sidx# (unMArray dst) didx# n# s# of + s2 -> (# s2, () #) + +-- | Unsafely copy the elements of an array. Array bounds are not checked. +copyM :: MArray s e -> Int -> MArray s e -> Int -> Int -> ST s () +copyM !src !_sidx@(I# sidx#) !dst !_didx@(I# didx#) _n@(I# n#) = + CHECK_BOUNDS("copyM: src", lengthM src, _sidx + _n - 1) + CHECK_BOUNDS("copyM: dst", lengthM dst, _didx + _n - 1) + ST $ \ s# -> + case copySmallMutableArray# (unMArray src) sidx# (unMArray dst) didx# n# s# of + s2 -> (# s2, () #) + +cloneM :: MArray s a -> Int -> Int -> ST s (MArray s a) +cloneM _mary@(MArray mary#) _off@(I# off#) _len@(I# len#) = + CHECK_BOUNDS("cloneM_off", lengthM _mary, _off) + CHECK_BOUNDS("cloneM_end", lengthM _mary, _off + _len - 1) + ST $ \ s -> + case cloneSmallMutableArray# mary# off# len# s of + (# s', mary'# #) -> (# s', MArray mary'# #) + +-- | Create a new array of the @n@ first elements of @mary@. +trim :: MArray s a -> Int -> ST s (Array a) +trim mary n = cloneM mary 0 n >>= unsafeFreeze +{-# INLINE trim #-} + +-- | \(O(n)\) Insert an element at the given position in this array, +-- increasing its size by one. +insert :: Array e -> Int -> e -> Array e +insert ary idx b = runST (insertM ary idx b) +{-# INLINE insert #-} + +-- | \(O(n)\) Insert an element at the given position in this array, +-- increasing its size by one. +insertM :: Array e -> Int -> e -> ST s (Array e) +insertM ary idx b = + CHECK_BOUNDS("insertM", count + 1, idx) + do mary <- new (count+1) b + copy ary 0 mary 0 idx + copy ary idx mary (idx+1) (count-idx) + unsafeFreeze mary + where !count = length ary +{-# INLINE insertM #-} + +-- | \(O(n)\) Update the element at the given position in this array. +update :: Array e -> Int -> e -> Array e +update ary idx b = runST (updateM ary idx b) +{-# INLINE update #-} + +-- | \(O(n)\) Update the element at the given position in this array. +updateM :: Array e -> Int -> e -> ST s (Array e) +updateM ary idx b = + CHECK_BOUNDS("updateM", count, idx) + do mary <- thaw ary 0 count + write mary idx b + unsafeFreeze mary + where !count = length ary +{-# INLINE updateM #-} + +-- | \(O(n)\) Update the element at the given positio in this array, by +-- applying a function to it. Evaluates the element to WHNF before +-- inserting it into the array. +updateWith' :: Array e -> Int -> (e -> e) -> Array e +updateWith' ary idx f + | (# x #) <- index# ary idx + = update ary idx $! f x +{-# INLINE updateWith' #-} + +-- | \(O(1)\) Update the element at the given position in this array, +-- without copying. +unsafeUpdateM :: Array e -> Int -> e -> ST s () +unsafeUpdateM ary idx b = + CHECK_BOUNDS("unsafeUpdateM", length ary, idx) + do mary <- unsafeThaw ary + write mary idx b + _ <- unsafeFreeze mary + return () +{-# INLINE unsafeUpdateM #-} + +foldl' :: (b -> a -> b) -> b -> Array a -> b +foldl' f = \ z0 ary0 -> go ary0 (length ary0) 0 z0 + where + go ary n i !z + | i >= n = z + | otherwise + = case index# ary i of + (# x #) -> go ary n (i+1) (f z x) +{-# INLINE foldl' #-} + +foldr' :: (a -> b -> b) -> b -> Array a -> b +foldr' f = \ z0 ary0 -> go ary0 (length ary0 - 1) z0 + where + go !_ary (-1) z = z + go !ary i !z + | (# x #) <- index# ary i + = go ary (i - 1) (f x z) +{-# INLINE foldr' #-} + +foldr :: (a -> b -> b) -> b -> Array a -> b +foldr f = \ z0 ary0 -> go ary0 (length ary0) 0 z0 + where + go ary n i z + | i >= n = z + | otherwise + = case index# ary i of + (# x #) -> f x (go ary n (i+1) z) +{-# INLINE foldr #-} + +foldl :: (b -> a -> b) -> b -> Array a -> b +foldl f = \ z0 ary0 -> go ary0 (length ary0 - 1) z0 + where + go _ary (-1) z = z + go ary i z + | (# x #) <- index# ary i + = f (go ary (i - 1) z) x +{-# INLINE foldl #-} + +-- We go to a bit of trouble here to avoid appending an extra mempty. +-- The below implementation is by Mateusz Kowalczyk, who indicates that +-- benchmarks show it to be faster than one that avoids lifting out +-- lst. +foldMap :: Monoid m => (a -> m) -> Array a -> m +foldMap f = \ary0 -> case length ary0 of + 0 -> mempty + len -> + let !lst = len - 1 + go i | (# x #) <- index# ary0 i, let fx = f x = + if i == lst then fx else fx `mappend` go (i + 1) + in go 0 +{-# INLINE foldMap #-} + +-- | Verifies that a predicate holds for all elements of an array. +all :: (a -> Bool) -> Array a -> Bool +all p = foldr (\a acc -> p a && acc) True +{-# INLINE all #-} + +undefinedElem :: a +undefinedElem = error "Data.HashMap.Internal.Array: Undefined element" +{-# NOINLINE undefinedElem #-} + +thaw :: Array e -> Int -> Int -> ST s (MArray s e) +thaw !ary !_o@(I# o#) _n@(I# n#) = + CHECK_LE("thaw", _o + _n, length ary) + ST $ \ s -> case thawSmallArray# (unArray ary) o# n# s of + (# s2, mary# #) -> (# s2, MArray mary# #) +{-# INLINE thaw #-} + +-- | \(O(n)\) Delete an element at the given position in this array, +-- decreasing its size by one. +delete :: Array e -> Int -> Array e +delete ary idx = runST (deleteM ary idx) +{-# INLINE delete #-} + +-- | \(O(n)\) Delete an element at the given position in this array, +-- decreasing its size by one. +deleteM :: Array e -> Int -> ST s (Array e) +deleteM ary idx = do + CHECK_BOUNDS("deleteM", count, idx) + do mary <- new_ (count-1) + copy ary 0 mary 0 idx + copy ary (idx+1) mary idx (count-(idx+1)) + unsafeFreeze mary + where !count = length ary +{-# INLINE deleteM #-} + +map :: (a -> b) -> Array a -> Array b +map f = \ ary -> + let !n = length ary + in run $ do + mary <- new_ n + go ary mary 0 n + return mary + where + go ary mary i n + | i >= n = return () + | otherwise = do + x <- indexM ary i + write mary i $ f x + go ary mary (i+1) n +{-# INLINE map #-} + +-- | Strict version of 'map'. +map' :: (a -> b) -> Array a -> Array b +map' f = \ ary -> + let !n = length ary + in run $ do + mary <- new_ n + go ary mary 0 n + return mary + where + go ary mary i n + | i >= n = return () + | otherwise = do + x <- indexM ary i + write mary i $! f x + go ary mary (i+1) n +{-# INLINE map' #-} + +fromList :: Int -> [a] -> Array a +fromList n xs0 = + CHECK_EQ("fromList", n, Prelude.length xs0) + run $ do + mary <- new_ n + go xs0 mary 0 + return mary + where + go [] !_ !_ = return () + go (x:xs) mary i = do write mary i x + go xs mary (i+1) + +fromList' :: Int -> [a] -> Array a +fromList' n xs0 = + CHECK_EQ("fromList'", n, Prelude.length xs0) + run $ do + mary <- new_ n + go xs0 mary 0 + return mary + where + go [] !_ !_ = return () + go (!x:xs) mary i = do write mary i x + go xs mary (i+1) + +-- | @since 0.2.17.0 +instance TH.Lift a => TH.Lift (Array a) where +#if MIN_VERSION_template_haskell(2,16,0) + liftTyped ar = [|| fromList' arlen arlist ||] +#else + lift ar = [| fromList' arlen arlist |] +#endif + where + arlen = length ar + arlist = toList ar + +toList :: Array a -> [a] +toList = foldr (:) [] + +newtype STA a = STA {_runSTA :: forall s. SmallMutableArray# s a -> ST s (Array a)} + +runSTA :: Int -> STA a -> Array a +runSTA !n (STA m) = runST $ new_ n >>= \ (MArray ar) -> m ar + +traverse :: Applicative f => (a -> f b) -> Array a -> f (Array b) +traverse f = \ !ary -> + let + !len = length ary + go !i + | i == len = pure $ STA $ \mary -> unsafeFreeze (MArray mary) + | (# x #) <- index# ary i + = liftA2 (\b (STA m) -> STA $ \mary -> + write (MArray mary) i b >> m mary) + (f x) (go (i + 1)) + in runSTA len <$> go 0 +{-# INLINE [1] traverse #-} + +-- TODO: Would it be better to just use a lazy traversal +-- and then force the elements of the result? My guess is +-- yes. +traverse' :: Applicative f => (a -> f b) -> Array a -> f (Array b) +traverse' f = \ !ary -> + let + !len = length ary + go !i + | i == len = pure $ STA $ \mary -> unsafeFreeze (MArray mary) + | (# x #) <- index# ary i + = liftA2 (\ !b (STA m) -> STA $ \mary -> + write (MArray mary) i b >> m mary) + (f x) (go (i + 1)) + in runSTA len <$> go 0 +{-# INLINE [1] traverse' #-} + +-- Traversing in ST, we don't need to get fancy; we +-- can just do it directly. +traverseST :: (a -> ST s b) -> Array a -> ST s (Array b) +traverseST f = \ ary0 -> + let + !len = length ary0 + go k !mary + | k == len = return mary + | otherwise = do + x <- indexM ary0 k + y <- f x + write mary k y + go (k + 1) mary + in new_ len >>= (go 0 >=> unsafeFreeze) +{-# INLINE traverseST #-} + +traverseIO :: (a -> IO b) -> Array a -> IO (Array b) +traverseIO f = \ ary0 -> + let + !len = length ary0 + go k !mary + | k == len = return mary + | otherwise = do + x <- stToIO $ indexM ary0 k + y <- f x + stToIO $ write mary k y + go (k + 1) mary + in stToIO (new_ len) >>= (go 0 >=> stToIO . unsafeFreeze) +{-# INLINE traverseIO #-} + + +-- Why don't we have similar RULES for traverse'? The efficient +-- way to traverse strictly in IO or ST is to force results as +-- they come in, which leads to different semantics. In particular, +-- we need to ensure that +-- +-- traverse' (\x -> print x *> pure undefined) xs +-- +-- will actually print all the values and then return undefined. +-- We could add a strict mapMWithIndex, operating in an arbitrary +-- Monad, that supported such rules, but we don't have that right now. +{-# RULES +"traverse/ST" forall f. traverse f = traverseST f +"traverse/IO" forall f. traverse f = traverseIO f + #-} diff --git a/Data/HashMap/Internal/List.hs b/Data/HashMap/Internal/List.hs new file mode 100644 index 0000000..01b1d92 --- /dev/null +++ b/Data/HashMap/Internal/List.hs @@ -0,0 +1,84 @@ +{-# LANGUAGE CPP #-} +{-# LANGUAGE ScopedTypeVariables #-} +{-# OPTIONS_GHC -fno-full-laziness -funbox-strict-fields #-} +{-# OPTIONS_HADDOCK not-home #-} + +-- | = WARNING +-- +-- This module is considered __internal__. +-- +-- The Package Versioning Policy __does not apply__. +-- +-- The contents of this module may change __in any way whatsoever__ +-- and __without any warning__ between minor versions of this package. +-- +-- Authors importing this module are expected to track development +-- closely. +-- +-- = Description +-- +-- Extra list functions +-- +-- In separate module to aid testing. +module Data.HashMap.Internal.List + ( isPermutationBy + , deleteBy + , unorderedCompare + ) where + +import Data.List (sortBy) +import Data.Maybe (fromMaybe) +#if !MIN_VERSION_base(4,11,0) +import Data.Semigroup ((<>)) +#endif + +-- Note: previous implemenation isPermutation = null (as // bs) +-- was O(n^2) too. +-- +-- This assumes lists are of equal length +isPermutationBy :: (a -> b -> Bool) -> [a] -> [b] -> Bool +isPermutationBy f = go + where + f' = flip f + + go [] [] = True + go (x : xs) (y : ys) + | f x y = go xs ys + | otherwise = fromMaybe False $ do + xs' <- deleteBy f' y xs + ys' <- deleteBy f x ys + return (go xs' ys') + go [] (_ : _) = False + go (_ : _) [] = False + +-- The idea: +-- +-- Homogeonous version +-- +-- uc :: (a -> a -> Ordering) -> [a] -> [a] -> Ordering +-- uc c as bs = compare (sortBy c as) (sortBy c bs) +-- +-- But as we have only (a -> b -> Ordering), we cannot directly compare +-- elements from the same list. +-- +-- So when comparing elements from the list, we count how many elements are +-- "less and greater" in the other list, and use the count as a metric. +-- +unorderedCompare :: (a -> b -> Ordering) -> [a] -> [b] -> Ordering +unorderedCompare c as bs = go (sortBy cmpA as) (sortBy cmpB bs) + where + go [] [] = EQ + go [] (_ : _) = LT + go (_ : _) [] = GT + go (x : xs) (y : ys) = c x y <> go xs ys + + cmpA a a' = compare (inB a) (inB a') + cmpB b b' = compare (inA b) (inA b') + + inB a = (length $ filter (\b -> c a b == GT) bs, negate $ length $ filter (\b -> c a b == LT) bs) + inA b = (length $ filter (\a -> c a b == LT) as, negate $ length $ filter (\a -> c a b == GT) as) + +-- Returns Nothing is nothing deleted +deleteBy :: (a -> b -> Bool) -> a -> [b] -> Maybe [b] +deleteBy _ _ [] = Nothing +deleteBy eq x (y:ys) = if x `eq` y then Just ys else fmap (y :) (deleteBy eq x ys) diff --git a/Data/HashMap/Internal/Strict.hs b/Data/HashMap/Internal/Strict.hs new file mode 100644 index 0000000..2d8fb37 --- /dev/null +++ b/Data/HashMap/Internal/Strict.hs @@ -0,0 +1,753 @@ +{-# LANGUAGE BangPatterns #-} +{-# LANGUAGE CPP #-} +{-# LANGUAGE LambdaCase #-} +{-# LANGUAGE MagicHash #-} +{-# LANGUAGE PatternGuards #-} +{-# LANGUAGE Trustworthy #-} +{-# LANGUAGE UnboxedTuples #-} +{-# OPTIONS_HADDOCK not-home #-} + +------------------------------------------------------------------------ +-- | +-- Module : Data.HashMap.Strict +-- Copyright : 2010-2012 Johan Tibell +-- License : BSD-style +-- Maintainer : johan.tibell@gmail.com +-- Portability : portable +-- +-- = WARNING +-- +-- This module is considered __internal__. +-- +-- The Package Versioning Policy __does not apply__. +-- +-- The contents of this module may change __in any way whatsoever__ +-- and __without any warning__ between minor versions of this package. +-- +-- Authors importing this module are expected to track development +-- closely. +-- +-- = Description +-- +-- A map from /hashable/ keys to values. A map cannot contain +-- duplicate keys; each key can map to at most one value. A 'HashMap' +-- makes no guarantees as to the order of its elements. +-- +-- The implementation is based on /hash array mapped tries/. A +-- 'HashMap' is often faster than other tree-based set types, +-- especially when key comparison is expensive, as in the case of +-- strings. +-- +-- Many operations have a average-case complexity of \(O(\log n)\). The +-- implementation uses a large base (i.e. 32) so in practice these +-- operations are constant time. +module Data.HashMap.Internal.Strict + ( + -- * Strictness properties + -- $strictness + + HashMap + + -- * Construction + , HM.empty + , singleton + + -- * Basic interface + , HM.null + , HM.size + , HM.member + , HM.lookup + , (HM.!?) + , HM.findWithDefault + , HM.lookupDefault + , (HM.!) + , insert + , insertWith + , HM.delete + , adjust + , update + , alter + , alterF + , HM.isSubmapOf + , HM.isSubmapOfBy + + -- * Combine + -- ** Union + , HM.union + , unionWith + , unionWithKey + , HM.unions + + -- ** Compose + , HM.compose + + -- * Transformations + , map + , mapWithKey + , traverseWithKey + , HM.mapKeys + + -- * Difference and intersection + , HM.difference + , differenceWith + , HM.intersection + , intersectionWith + , intersectionWithKey + + -- * Folds + , HM.foldMapWithKey + , HM.foldr' + , HM.foldl' + , HM.foldrWithKey' + , HM.foldlWithKey' + , HM.foldr + , HM.foldl + , HM.foldrWithKey + , HM.foldlWithKey + + -- * Filter + , HM.filter + , HM.filterWithKey + , mapMaybe + , mapMaybeWithKey + + -- * Conversions + , HM.keys + , HM.elems + + -- ** Lists + , HM.toList + , fromList + , fromListWith + , fromListWithKey + ) where + +import Control.Applicative (Const (..)) +import Control.Monad.ST (runST) +import Data.Bits ((.&.), (.|.)) +import Data.Coerce (coerce) +import Data.Functor.Identity (Identity (..)) +-- See Note [Imports from Data.HashMap.Internal] +import Data.Hashable (Hashable) +import Data.HashMap.Internal (Hash, HashMap (..), Leaf (..), LookupRes (..), + bitsPerSubkey, fullNodeMask, hash, index, mask, + ptrEq, sparseIndex) +import Prelude hiding (lookup, map) + +-- See Note [Imports from Data.HashMap.Internal] +import qualified Data.HashMap.Internal as HM +import qualified Data.HashMap.Internal.Array as A +import qualified Data.List as List +import qualified GHC.Exts as Exts + +{- +Note [Imports from Data.HashMap.Internal] +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +It is very important for code in this module not to make mistakes about +the strictness properties of any utilities. Mistakes can easily lead to space +leaks, see e.g. #383. + +Therefore nearly all functions imported from Data.HashMap.Internal should be +imported qualified. Only functions that do not manipulate HashMaps or their +values are exempted. +-} + +-- $strictness +-- +-- This module satisfies the following strictness properties: +-- +-- 1. Key arguments are evaluated to WHNF; +-- +-- 2. Keys and values are evaluated to WHNF before they are stored in +-- the map. + +------------------------------------------------------------------------ +-- * Construction + +-- | \(O(1)\) Construct a map with a single element. +singleton :: (Hashable k) => k -> v -> HashMap k v +singleton k !v = HM.singleton k v + +------------------------------------------------------------------------ +-- * Basic interface + +-- | \(O(\log n)\) Associate the specified value with the specified +-- key in this map. If this map previously contained a mapping for +-- the key, the old value is replaced. +insert :: (Eq k, Hashable k) => k -> v -> HashMap k v -> HashMap k v +insert k !v = HM.insert k v +{-# INLINABLE insert #-} + +-- | \(O(\log n)\) Associate the value with the key in this map. If +-- this map previously contained a mapping for the key, the old value +-- is replaced by the result of applying the given function to the new +-- and old value. Example: +-- +-- > insertWith f k v map +-- > where f new old = new + old +insertWith :: (Eq k, Hashable k) => (v -> v -> v) -> k -> v -> HashMap k v + -> HashMap k v +insertWith f k0 v0 m0 = go h0 k0 v0 0 m0 + where + h0 = hash k0 + go !h !k x !_ Empty = leaf h k x + go h k x s t@(Leaf hy l@(L ky y)) + | hy == h = if ky == k + then leaf h k (f x y) + else x `seq` HM.collision h l (L k x) + | otherwise = x `seq` runST (HM.two s h k x hy t) + go h k x s (BitmapIndexed b ary) + | b .&. m == 0 = + let ary' = A.insert ary i $! leaf h k x + in HM.bitmapIndexedOrFull (b .|. m) ary' + | otherwise = + let st = A.index ary i + st' = go h k x (s+bitsPerSubkey) st + ary' = A.update ary i $! st' + in BitmapIndexed b ary' + where m = mask h s + i = sparseIndex b m + go h k x s (Full ary) = + let st = A.index ary i + st' = go h k x (s+bitsPerSubkey) st + ary' = HM.update32 ary i $! st' + in Full ary' + where i = index h s + go h k x s t@(Collision hy v) + | h == hy = Collision h (updateOrSnocWith f k x v) + | otherwise = go h k x s $ BitmapIndexed (mask hy s) (A.singleton t) +{-# INLINABLE insertWith #-} + +-- | In-place update version of insertWith +unsafeInsertWith :: (Eq k, Hashable k) => (v -> v -> v) -> k -> v -> HashMap k v + -> HashMap k v +unsafeInsertWith f k0 v0 m0 = unsafeInsertWithKey (const f) k0 v0 m0 +{-# INLINABLE unsafeInsertWith #-} + +unsafeInsertWithKey :: (Eq k, Hashable k) => (k -> v -> v -> v) -> k -> v -> HashMap k v + -> HashMap k v +unsafeInsertWithKey f k0 v0 m0 = runST (go h0 k0 v0 0 m0) + where + h0 = hash k0 + go !h !k x !_ Empty = return $! leaf h k x + go h k x s t@(Leaf hy l@(L ky y)) + | hy == h = if ky == k + then return $! leaf h k (f k x y) + else do + let l' = x `seq` L k x + return $! HM.collision h l l' + | otherwise = x `seq` HM.two s h k x hy t + go h k x s t@(BitmapIndexed b ary) + | b .&. m == 0 = do + ary' <- A.insertM ary i $! leaf h k x + return $! HM.bitmapIndexedOrFull (b .|. m) ary' + | otherwise = do + st <- A.indexM ary i + st' <- go h k x (s+bitsPerSubkey) st + A.unsafeUpdateM ary i st' + return t + where m = mask h s + i = sparseIndex b m + go h k x s t@(Full ary) = do + st <- A.indexM ary i + st' <- go h k x (s+bitsPerSubkey) st + A.unsafeUpdateM ary i st' + return t + where i = index h s + go h k x s t@(Collision hy v) + | h == hy = return $! Collision h (updateOrSnocWithKey f k x v) + | otherwise = go h k x s $ BitmapIndexed (mask hy s) (A.singleton t) +{-# INLINABLE unsafeInsertWithKey #-} + +-- | \(O(\log n)\) Adjust the value tied to a given key in this map only +-- if it is present. Otherwise, leave the map alone. +adjust :: (Eq k, Hashable k) => (v -> v) -> k -> HashMap k v -> HashMap k v +adjust f k0 m0 = go h0 k0 0 m0 + where + h0 = hash k0 + go !_ !_ !_ Empty = Empty + go h k _ t@(Leaf hy (L ky y)) + | hy == h && ky == k = leaf h k (f y) + | otherwise = t + go h k s t@(BitmapIndexed b ary) + | b .&. m == 0 = t + | otherwise = let st = A.index ary i + st' = go h k (s+bitsPerSubkey) st + ary' = A.update ary i $! st' + in BitmapIndexed b ary' + where m = mask h s + i = sparseIndex b m + go h k s (Full ary) = + let i = index h s + st = A.index ary i + st' = go h k (s+bitsPerSubkey) st + ary' = HM.update32 ary i $! st' + in Full ary' + go h k _ t@(Collision hy v) + | h == hy = Collision h (updateWith f k v) + | otherwise = t +{-# INLINABLE adjust #-} + +-- | \(O(\log n)\) The expression @('update' f k map)@ updates the value @x@ at @k@ +-- (if it is in the map). If @(f x)@ is 'Nothing', the element is deleted. +-- If it is @('Just' y)@, the key @k@ is bound to the new value @y@. +update :: (Eq k, Hashable k) => (a -> Maybe a) -> k -> HashMap k a -> HashMap k a +update f = alter (>>= f) +{-# INLINABLE update #-} + +-- | \(O(\log n)\) The expression @('alter' f k map)@ alters the value @x@ at @k@, or +-- absence thereof. +-- +-- 'alter' can be used to insert, delete, or update a value in a map. In short: +-- +-- @ +-- 'lookup' k ('alter' f k m) = f ('lookup' k m) +-- @ +alter :: (Eq k, Hashable k) => (Maybe v -> Maybe v) -> k -> HashMap k v -> HashMap k v +alter f k m = + case f (HM.lookup k m) of + Nothing -> HM.delete k m + Just v -> insert k v m +{-# INLINABLE alter #-} + +-- | \(O(\log n)\) The expression (@'alterF' f k map@) alters the value @x@ at +-- @k@, or absence thereof. +-- +-- 'alterF' can be used to insert, delete, or update a value in a map. +-- +-- Note: 'alterF' is a flipped version of the 'at' combinator from +-- . +-- +-- @since 0.2.10 +alterF :: (Functor f, Eq k, Hashable k) + => (Maybe v -> f (Maybe v)) -> k -> HashMap k v -> f (HashMap k v) +-- Special care is taken to only calculate the hash once. When we rewrite +-- with RULES, we also ensure that we only compare the key for equality +-- once. We force the value of the map for consistency with the rewritten +-- version; otherwise someone could tell the difference using a lazy +-- @f@ and a functor that is similar to Const but not actually Const. +alterF f = \ !k !m -> + let !h = hash k + mv = HM.lookup' h k m + in (<$> f mv) $ \case + Nothing -> maybe m (const (HM.delete' h k m)) mv + Just !v' -> HM.insert' h k v' m + +-- We rewrite this function unconditionally in RULES, but we expose +-- an unfolding just in case it's used in a context where the rules +-- don't fire. +{-# INLINABLE [0] alterF #-} + +-- See notes in Data.HashMap.Internal +test_bottom :: a +test_bottom = error "Data.HashMap.alterF internal error: hit test_bottom" + +bogus# :: (# #) -> (# a #) +bogus# _ = error "Data.HashMap.alterF internal error: hit bogus#" + +impossibleAdjust :: a +impossibleAdjust = error "Data.HashMap.alterF internal error: impossible adjust" + +{-# RULES + +-- See detailed notes on alterF rules in Data.HashMap.Internal. + +"alterFWeird" forall f. alterF f = + alterFWeird (f Nothing) (f (Just test_bottom)) f + +"alterFconstant" forall (f :: Maybe a -> Identity (Maybe a)) x. + alterFWeird x x f = \ !k !m -> + Identity (case runIdentity x of {Nothing -> HM.delete k m; Just a -> insert k a m}) + +"alterFinsertWith" [1] forall (f :: Maybe a -> Identity (Maybe a)) x y. + alterFWeird (coerce (Just x)) (coerce (Just y)) f = + coerce (HM.insertModifying x (\mold -> case runIdentity (f (Just mold)) of + Nothing -> bogus# (# #) + Just !new -> (# new #))) + +-- This rule is written a bit differently than the one for lazy +-- maps because the adjust here is strict. We could write it the +-- same general way anyway, but this seems simpler. +"alterFadjust" forall (f :: Maybe a -> Identity (Maybe a)) x. + alterFWeird (coerce Nothing) (coerce (Just x)) f = + coerce (adjust (\a -> case runIdentity (f (Just a)) of + Just a' -> a' + Nothing -> impossibleAdjust)) + +"alterFlookup" forall _ign1 _ign2 (f :: Maybe a -> Const r (Maybe a)) . + alterFWeird _ign1 _ign2 f = \ !k !m -> Const (getConst (f (HM.lookup k m))) + #-} + +-- This is a very unsafe version of alterF used for RULES. When calling +-- alterFWeird x y f, the following *must* hold: +-- +-- x = f Nothing +-- y = f (Just _|_) +-- +-- Failure to abide by these laws will make demons come out of your nose. +alterFWeird + :: (Functor f, Eq k, Hashable k) + => f (Maybe v) + -> f (Maybe v) + -> (Maybe v -> f (Maybe v)) -> k -> HashMap k v -> f (HashMap k v) +alterFWeird _ _ f = alterFEager f +{-# INLINE [0] alterFWeird #-} + +-- | This is the default version of alterF that we use in most non-trivial +-- cases. It's called "eager" because it looks up the given key in the map +-- eagerly, whether or not the given function requires that information. +alterFEager :: (Functor f, Eq k, Hashable k) + => (Maybe v -> f (Maybe v)) -> k -> HashMap k v -> f (HashMap k v) +alterFEager f !k !m = (<$> f mv) $ \fres -> + case fres of + + ------------------------------ + -- Delete the key from the map. + Nothing -> case lookupRes of + + -- Key did not exist in the map to begin with, no-op + Absent -> m + + -- Key did exist, no collision + Present _ collPos -> HM.deleteKeyExists collPos h k m + + ------------------------------ + -- Update value + Just !v' -> case lookupRes of + + -- Key did not exist before, insert v' under a new key + Absent -> HM.insertNewKey h k v' m + + -- Key existed before, no hash collision + Present v collPos -> + if v `ptrEq` v' + -- If the value is identical, no-op + then m + -- If the value changed, update the value. + else HM.insertKeyExists collPos h k v' m + + where !h = hash k + !lookupRes = HM.lookupRecordCollision h k m + !mv = case lookupRes of + Absent -> Nothing + Present v _ -> Just v +{-# INLINABLE alterFEager #-} + +------------------------------------------------------------------------ +-- * Combine + +-- | \(O(n+m)\) The union of two maps. If a key occurs in both maps, +-- the provided function (first argument) will be used to compute the result. +unionWith :: (Eq k, Hashable k) => (v -> v -> v) -> HashMap k v -> HashMap k v + -> HashMap k v +unionWith f = unionWithKey (const f) +{-# INLINE unionWith #-} + +-- | \(O(n+m)\) The union of two maps. If a key occurs in both maps, +-- the provided function (first argument) will be used to compute the result. +unionWithKey :: (Eq k, Hashable k) => (k -> v -> v -> v) -> HashMap k v -> HashMap k v + -> HashMap k v +unionWithKey f = go 0 + where + -- empty vs. anything + go !_ t1 Empty = t1 + go _ Empty t2 = t2 + -- leaf vs. leaf + go s t1@(Leaf h1 l1@(L k1 v1)) t2@(Leaf h2 l2@(L k2 v2)) + | h1 == h2 = if k1 == k2 + then leaf h1 k1 (f k1 v1 v2) + else HM.collision h1 l1 l2 + | otherwise = goDifferentHash s h1 h2 t1 t2 + go s t1@(Leaf h1 (L k1 v1)) t2@(Collision h2 ls2) + | h1 == h2 = Collision h1 (updateOrSnocWithKey f k1 v1 ls2) + | otherwise = goDifferentHash s h1 h2 t1 t2 + go s t1@(Collision h1 ls1) t2@(Leaf h2 (L k2 v2)) + | h1 == h2 = Collision h1 (updateOrSnocWithKey (flip . f) k2 v2 ls1) + | otherwise = goDifferentHash s h1 h2 t1 t2 + go s t1@(Collision h1 ls1) t2@(Collision h2 ls2) + | h1 == h2 = Collision h1 (HM.updateOrConcatWithKey (\k a b -> let !v = f k a b in (# v #)) ls1 ls2) + | otherwise = goDifferentHash s h1 h2 t1 t2 + -- branch vs. branch + go s (BitmapIndexed b1 ary1) (BitmapIndexed b2 ary2) = + let b' = b1 .|. b2 + ary' = HM.unionArrayBy (go (s+bitsPerSubkey)) b1 b2 ary1 ary2 + in HM.bitmapIndexedOrFull b' ary' + go s (BitmapIndexed b1 ary1) (Full ary2) = + let ary' = HM.unionArrayBy (go (s+bitsPerSubkey)) b1 fullNodeMask ary1 ary2 + in Full ary' + go s (Full ary1) (BitmapIndexed b2 ary2) = + let ary' = HM.unionArrayBy (go (s+bitsPerSubkey)) fullNodeMask b2 ary1 ary2 + in Full ary' + go s (Full ary1) (Full ary2) = + let ary' = HM.unionArrayBy (go (s+bitsPerSubkey)) fullNodeMask fullNodeMask + ary1 ary2 + in Full ary' + -- leaf vs. branch + go s (BitmapIndexed b1 ary1) t2 + | b1 .&. m2 == 0 = let ary' = A.insert ary1 i t2 + b' = b1 .|. m2 + in HM.bitmapIndexedOrFull b' ary' + | otherwise = let ary' = A.updateWith' ary1 i $ \st1 -> + go (s+bitsPerSubkey) st1 t2 + in BitmapIndexed b1 ary' + where + h2 = leafHashCode t2 + m2 = mask h2 s + i = sparseIndex b1 m2 + go s t1 (BitmapIndexed b2 ary2) + | b2 .&. m1 == 0 = let ary' = A.insert ary2 i $! t1 + b' = b2 .|. m1 + in HM.bitmapIndexedOrFull b' ary' + | otherwise = let ary' = A.updateWith' ary2 i $ \st2 -> + go (s+bitsPerSubkey) t1 st2 + in BitmapIndexed b2 ary' + where + h1 = leafHashCode t1 + m1 = mask h1 s + i = sparseIndex b2 m1 + go s (Full ary1) t2 = + let h2 = leafHashCode t2 + i = index h2 s + ary' = HM.update32With' ary1 i $ \st1 -> go (s+bitsPerSubkey) st1 t2 + in Full ary' + go s t1 (Full ary2) = + let h1 = leafHashCode t1 + i = index h1 s + ary' = HM.update32With' ary2 i $ \st2 -> go (s+bitsPerSubkey) t1 st2 + in Full ary' + + leafHashCode (Leaf h _) = h + leafHashCode (Collision h _) = h + leafHashCode _ = error "leafHashCode" + + goDifferentHash s h1 h2 t1 t2 + | m1 == m2 = BitmapIndexed m1 (A.singleton $! goDifferentHash (s+bitsPerSubkey) h1 h2 t1 t2) + | m1 < m2 = BitmapIndexed (m1 .|. m2) (A.pair t1 t2) + | otherwise = BitmapIndexed (m1 .|. m2) (A.pair t2 t1) + where + m1 = mask h1 s + m2 = mask h2 s +{-# INLINE unionWithKey #-} + +------------------------------------------------------------------------ +-- * Transformations + +-- | \(O(n)\) Transform this map by applying a function to every value. +mapWithKey :: (k -> v1 -> v2) -> HashMap k v1 -> HashMap k v2 +mapWithKey f = go + where + go Empty = Empty + go (Leaf h (L k v)) = leaf h k (f k v) + go (BitmapIndexed b ary) = BitmapIndexed b $ A.map' go ary + go (Full ary) = Full $ A.map' go ary + go (Collision h ary) = + Collision h $ A.map' (\ (L k v) -> let !v' = f k v in L k v') ary +{-# INLINE mapWithKey #-} + +-- | \(O(n)\) Transform this map by applying a function to every value. +map :: (v1 -> v2) -> HashMap k v1 -> HashMap k v2 +map f = mapWithKey (const f) +{-# INLINE map #-} + + +------------------------------------------------------------------------ +-- * Filter + +-- | \(O(n)\) Transform this map by applying a function to every value +-- and retaining only some of them. +mapMaybeWithKey :: (k -> v1 -> Maybe v2) -> HashMap k v1 -> HashMap k v2 +mapMaybeWithKey f = HM.filterMapAux onLeaf onColl + where onLeaf (Leaf h (L k v)) | Just v' <- f k v = Just (leaf h k v') + onLeaf _ = Nothing + + onColl (L k v) | Just !v' <- f k v = Just (L k v') + | otherwise = Nothing +{-# INLINE mapMaybeWithKey #-} + +-- | \(O(n)\) Transform this map by applying a function to every value +-- and retaining only some of them. +mapMaybe :: (v1 -> Maybe v2) -> HashMap k v1 -> HashMap k v2 +mapMaybe f = mapMaybeWithKey (const f) +{-# INLINE mapMaybe #-} + +-- | \(O(n)\) Perform an 'Applicative' action for each key-value pair +-- in a 'HashMap' and produce a 'HashMap' of all the results. Each 'HashMap' +-- will be strict in all its values. +-- +-- @ +-- traverseWithKey f = fmap ('map' id) . "Data.HashMap.Lazy".'Data.HashMap.Lazy.traverseWithKey' f +-- @ +-- +-- Note: the order in which the actions occur is unspecified. In particular, +-- when the map contains hash collisions, the order in which the actions +-- associated with the keys involved will depend in an unspecified way on +-- their insertion order. +traverseWithKey + :: Applicative f + => (k -> v1 -> f v2) + -> HashMap k v1 -> f (HashMap k v2) +traverseWithKey f = go + where + go Empty = pure Empty + go (Leaf h (L k v)) = leaf h k <$> f k v + go (BitmapIndexed b ary) = BitmapIndexed b <$> A.traverse' go ary + go (Full ary) = Full <$> A.traverse' go ary + go (Collision h ary) = + Collision h <$> A.traverse' (\ (L k v) -> (L k $!) <$> f k v) ary +{-# INLINE traverseWithKey #-} + +------------------------------------------------------------------------ +-- * Difference and intersection + +-- | \(O(n \log m)\) Difference with a combining function. When two equal keys are +-- encountered, the combining function is applied to the values of these keys. +-- If it returns 'Nothing', the element is discarded (proper set difference). If +-- it returns (@'Just' y@), the element is updated with a new value @y@. +differenceWith :: (Eq k, Hashable k) => (v -> w -> Maybe v) -> HashMap k v -> HashMap k w -> HashMap k v +differenceWith f a b = HM.foldlWithKey' go HM.empty a + where + go m k v = case HM.lookup k b of + Nothing -> v `seq` HM.unsafeInsert k v m + Just w -> maybe m (\ !y -> HM.unsafeInsert k y m) (f v w) +{-# INLINABLE differenceWith #-} + +-- | \(O(n+m)\) Intersection of two maps. If a key occurs in both maps +-- the provided function is used to combine the values from the two +-- maps. +intersectionWith :: (Eq k, Hashable k) => (v1 -> v2 -> v3) -> HashMap k v1 + -> HashMap k v2 -> HashMap k v3 +intersectionWith f = Exts.inline intersectionWithKey $ const f +{-# INLINABLE intersectionWith #-} + +-- | \(O(n+m)\) Intersection of two maps. If a key occurs in both maps +-- the provided function is used to combine the values from the two +-- maps. +intersectionWithKey :: (Eq k, Hashable k) => (k -> v1 -> v2 -> v3) + -> HashMap k v1 -> HashMap k v2 -> HashMap k v3 +intersectionWithKey f = HM.intersectionWithKey# $ \k v1 v2 -> let !v3 = f k v1 v2 in (# v3 #) +{-# INLINABLE intersectionWithKey #-} + +------------------------------------------------------------------------ +-- ** Lists + +-- | \(O(n \log n)\) Construct a map with the supplied mappings. If the +-- list contains duplicate mappings, the later mappings take +-- precedence. +fromList :: (Eq k, Hashable k) => [(k, v)] -> HashMap k v +fromList = List.foldl' (\ m (k, !v) -> HM.unsafeInsert k v m) HM.empty +{-# INLINABLE fromList #-} + +-- | \(O(n \log n)\) Construct a map from a list of elements. Uses +-- the provided function @f@ to merge duplicate entries with +-- @(f newVal oldVal)@. +-- +-- === Examples +-- +-- Given a list @xs@, create a map with the number of occurrences of each +-- element in @xs@: +-- +-- > let xs = ['a', 'b', 'a'] +-- > in fromListWith (+) [ (x, 1) | x <- xs ] +-- > +-- > = fromList [('a', 2), ('b', 1)] +-- +-- Given a list of key-value pairs @xs :: [(k, v)]@, group all values by their +-- keys and return a @HashMap k [v]@. +-- +-- > let xs = ('a', 1), ('b', 2), ('a', 3)] +-- > in fromListWith (++) [ (k, [v]) | (k, v) <- xs ] +-- > +-- > = fromList [('a', [3, 1]), ('b', [2])] +-- +-- Note that the lists in the resulting map contain elements in reverse order +-- from their occurences in the original list. +-- +-- More generally, duplicate entries are accumulated as follows; +-- this matters when @f@ is not commutative or not associative. +-- +-- > fromListWith f [(k, a), (k, b), (k, c), (k, d)] +-- > = fromList [(k, f d (f c (f b a)))] +fromListWith :: (Eq k, Hashable k) => (v -> v -> v) -> [(k, v)] -> HashMap k v +fromListWith f = List.foldl' (\ m (k, v) -> unsafeInsertWith f k v m) HM.empty +{-# INLINE fromListWith #-} + +-- | \(O(n \log n)\) Construct a map from a list of elements. Uses +-- the provided function to merge duplicate entries. +-- +-- === Examples +-- +-- Given a list of key-value pairs where the keys are of different flavours, e.g: +-- +-- > data Key = Div | Sub +-- +-- and the values need to be combined differently when there are duplicates, +-- depending on the key: +-- +-- > combine Div = div +-- > combine Sub = (-) +-- +-- then @fromListWithKey@ can be used as follows: +-- +-- > fromListWithKey combine [(Div, 2), (Div, 6), (Sub, 2), (Sub, 3)] +-- > = fromList [(Div, 3), (Sub, 1)] +-- +-- More generally, duplicate entries are accumulated as follows; +-- +-- > fromListWith f [(k, a), (k, b), (k, c), (k, d)] +-- > = fromList [(k, f k d (f k c (f k b a)))] +-- +-- @since 0.2.11 +fromListWithKey :: (Eq k, Hashable k) => (k -> v -> v -> v) -> [(k, v)] -> HashMap k v +fromListWithKey f = List.foldl' (\ m (k, v) -> unsafeInsertWithKey f k v m) HM.empty +{-# INLINE fromListWithKey #-} + +------------------------------------------------------------------------ +-- Array operations + +updateWith :: Eq k => (v -> v) -> k -> A.Array (Leaf k v) -> A.Array (Leaf k v) +updateWith f k0 ary0 = go k0 ary0 0 (A.length ary0) + where + go !k !ary !i !n + | i >= n = ary + | otherwise = case A.index ary i of + (L kx y) | k == kx -> let !v' = f y in A.update ary i (L k v') + | otherwise -> go k ary (i+1) n +{-# INLINABLE updateWith #-} + +-- | Append the given key and value to the array. If the key is +-- already present, instead update the value of the key by applying +-- the given function to the new and old value (in that order). The +-- value is always evaluated to WHNF before being inserted into the +-- array. +updateOrSnocWith :: Eq k => (v -> v -> v) -> k -> v -> A.Array (Leaf k v) + -> A.Array (Leaf k v) +updateOrSnocWith f = updateOrSnocWithKey (const f) +{-# INLINABLE updateOrSnocWith #-} + +-- | Append the given key and value to the array. If the key is +-- already present, instead update the value of the key by applying +-- the given function to the new and old value (in that order). The +-- value is always evaluated to WHNF before being inserted into the +-- array. +updateOrSnocWithKey :: Eq k => (k -> v -> v -> v) -> k -> v -> A.Array (Leaf k v) + -> A.Array (Leaf k v) +updateOrSnocWithKey f k0 v0 ary0 = go k0 v0 ary0 0 (A.length ary0) + where + go !k v !ary !i !n + -- Not found, append to the end. + | i >= n = A.snoc ary $! L k $! v + | otherwise = case A.index ary i of + (L kx y) | k == kx -> let !v' = f k v y in A.update ary i (L k v') + | otherwise -> go k v ary (i+1) n +{-# INLINABLE updateOrSnocWithKey #-} + +------------------------------------------------------------------------ +-- Smart constructors +-- +-- These constructors make sure the value is in WHNF before it's +-- inserted into the constructor. + +leaf :: Hash -> k -> v -> HashMap k v +leaf h k = \ !v -> Leaf h (L k v) +{-# INLINE leaf #-} diff --git a/Data/HashMap/Lazy.hs b/Data/HashMap/Lazy.hs new file mode 100644 index 0000000..82697c0 --- /dev/null +++ b/Data/HashMap/Lazy.hs @@ -0,0 +1,118 @@ +{-# LANGUAGE CPP #-} +{-# LANGUAGE Trustworthy #-} + +------------------------------------------------------------------------ +-- | +-- Module : Data.HashMap.Lazy +-- Copyright : 2010-2012 Johan Tibell +-- License : BSD-style +-- Maintainer : johan.tibell@gmail.com +-- Stability : provisional +-- Portability : portable +-- +-- A map from /hashable/ keys to values. A map cannot contain +-- duplicate keys; each key can map to at most one value. A 'HashMap' +-- makes no guarantees as to the order of its elements. +-- +-- The implementation is based on /hash array mapped tries/. A +-- 'HashMap' is often faster than other tree-based set types, +-- especially when key comparison is expensive, as in the case of +-- strings. +-- +-- Many operations have a average-case complexity of \(O(\log n)\). The +-- implementation uses a large base (i.e. 32) so in practice these +-- operations are constant time. +module Data.HashMap.Lazy + ( + -- * Strictness properties + -- $strictness + + HashMap + + -- * Construction + , empty + , singleton + + -- * Basic interface + , null + , size + , member + , lookup + , (!?) + , findWithDefault + , lookupDefault + , (!) + , insert + , insertWith + , delete + , adjust + , update + , alter + , alterF + , isSubmapOf + , isSubmapOfBy + + -- * Combine + -- ** Union + , union + , unionWith + , unionWithKey + , unions + + -- ** Compose + , compose + + -- * Transformations + , map + , mapWithKey + , traverseWithKey + , mapKeys + + -- * Difference and intersection + , difference + , differenceWith + , intersection + , intersectionWith + , intersectionWithKey + + -- * Folds + , foldMapWithKey + , foldr + , foldl + , foldr' + , foldl' + , foldrWithKey' + , foldlWithKey' + , foldrWithKey + , foldlWithKey + + -- * Filter + , filter + , filterWithKey + , mapMaybe + , mapMaybeWithKey + + -- * Conversions + , keys + , elems + + -- ** Lists + , toList + , fromList + , fromListWith + , fromListWithKey + + -- ** HashSets + , HS.keysSet + ) where + +import Data.HashMap.Internal +import Prelude () + +import qualified Data.HashSet.Internal as HS + +-- $strictness +-- +-- This module satisfies the following strictness property: +-- +-- * Key arguments are evaluated to WHNF. diff --git a/Data/HashMap/Strict.hs b/Data/HashMap/Strict.hs new file mode 100644 index 0000000..eb34d3d --- /dev/null +++ b/Data/HashMap/Strict.hs @@ -0,0 +1,120 @@ +{-# LANGUAGE Safe #-} + +------------------------------------------------------------------------ +-- | +-- Module : Data.HashMap.Strict +-- Copyright : 2010-2012 Johan Tibell +-- License : BSD-style +-- Maintainer : johan.tibell@gmail.com +-- Stability : provisional +-- Portability : portable +-- +-- A map from /hashable/ keys to values. A map cannot contain +-- duplicate keys; each key can map to at most one value. A 'HashMap' +-- makes no guarantees as to the order of its elements. +-- +-- The implementation is based on /hash array mapped tries/. A +-- 'HashMap' is often faster than other tree-based set types, +-- especially when key comparison is expensive, as in the case of +-- strings. +-- +-- Many operations have a average-case complexity of \(O(\log n)\). The +-- implementation uses a large base (i.e. 16) so in practice these +-- operations are constant time. +module Data.HashMap.Strict + ( + -- * Strictness properties + -- $strictness + + HashMap + + -- * Construction + , empty + , singleton + + -- * Basic interface + , null + , size + , member + , lookup + , (!?) + , findWithDefault + , lookupDefault + , (!) + , insert + , insertWith + , delete + , adjust + , update + , alter + , alterF + , isSubmapOf + , isSubmapOfBy + + -- * Combine + -- ** Union + , union + , unionWith + , unionWithKey + , unions + + -- ** Compose + , compose + + -- * Transformations + , map + , mapWithKey + , traverseWithKey + , mapKeys + + -- * Difference and intersection + , difference + , differenceWith + , intersection + , intersectionWith + , intersectionWithKey + + -- * Folds + , foldMapWithKey + , foldr + , foldl + , foldr' + , foldl' + , foldrWithKey' + , foldlWithKey' + , foldrWithKey + , foldlWithKey + + -- * Filter + , filter + , filterWithKey + , mapMaybe + , mapMaybeWithKey + + -- * Conversions + , keys + , elems + + -- ** Lists + , toList + , fromList + , fromListWith + , fromListWithKey + + -- ** HashSets + , HS.keysSet + ) where + +import Data.HashMap.Internal.Strict +import Prelude () + +import qualified Data.HashSet.Internal as HS + +-- $strictness +-- +-- This module satisfies the following strictness properties: +-- +-- 1. Key arguments are evaluated to WHNF; +-- +-- 2. Keys and values are evaluated to WHNF before they are stored in +-- the map. diff --git a/Data/HashSet.hs b/Data/HashSet.hs new file mode 100644 index 0000000..6d58918 --- /dev/null +++ b/Data/HashSet.hs @@ -0,0 +1,140 @@ +{-# LANGUAGE CPP #-} +{-# LANGUAGE Safe #-} + +------------------------------------------------------------------------ +{-| +Module : Data.HashSet +Copyright : 2011 Bryan O'Sullivan +License : BSD-style +Maintainer : johan.tibell@gmail.com +Stability : provisional +Portability : portable + += Introduction + +'HashSet' allows you to store /unique/ elements, providing efficient insertion, +lookups, and deletion. A 'HashSet' makes no guarantees as to the order of its +elements. + +If you are storing sets of "Data.Int"s consider using "Data.IntSet" from the + package. + + +== Examples + +All the examples below assume @HashSet@ is imported qualified, and uses the following @dataStructures@ set. + +>>> import qualified Data.HashSet as HashSet +>>> let dataStructures = HashSet.fromList ["Set", "Map", "Graph", "Sequence"] + +=== Basic Operations + +Check membership in a set: + +>>> -- Check if "Map" and "Trie" are in the set of data structures. +>>> HashSet.member "Map" dataStructures +True +>>> HashSet.member "Trie" dataStructures +False + +Add a new entry to the set: + +>>> let moreDataStructures = HashSet.insert "Trie" dataStructures +>>> HashSet.member "Trie" moreDataStructures +> True + +Remove the @\"Graph\"@ entry from the set of data structures. + +>>> let fewerDataStructures = HashSet.delete "Graph" dataStructures +>>> HashSet.toList fewerDataStructures +["Map","Set","Sequence"] + + +Create a new set and combine it with our original set. + +>>> let unorderedDataStructures = HashSet.fromList ["HashSet", "HashMap"] +>>> HashSet.union dataStructures unorderedDataStructures +fromList ["Map","HashSet","Graph","HashMap","Set","Sequence"] + +=== Using custom data with HashSet + +To create a @HashSet@ of your custom type, the type must have instances for +'Data.Eq.Eq' and 'Data.Hashable.Hashable'. The @Hashable@ typeclass is defined in the + package, see the +documentation for information on how to make your type an instance of +@Hashable@. + +We'll start by setting up our custom data type: + +>>> :set -XDeriveGeneric +>>> import GHC.Generics (Generic) +>>> import Data.Hashable +>>> data Person = Person { name :: String, likesDogs :: Bool } deriving (Show, Eq, Generic) +>>> instance Hashable Person + +And now we'll use it! + +>>> let people = HashSet.fromList [Person "Lana" True, Person "Joe" False, Person "Simon" True] +>>> HashSet.filter likesDogs people +fromList [Person {name = "Simon", likesDogs = True},Person {name = "Lana", likesDogs = True}] + + +== Performance + +The implementation is based on /hash array mapped tries/. A +'HashSet' is often faster than other 'Data.Ord.Ord'-based set types, +especially when value comparisons are expensive, as in the case of +strings. + +Many operations have a average-case complexity of \(O(\log n)\). The +implementation uses a large base (i.e. 16) so in practice these +operations are constant time. +-} + +module Data.HashSet + ( + HashSet + + -- * Construction + , empty + , singleton + + -- * Combine + , union + , unions + + -- * Basic interface + , null + , size + , member + , insert + , delete + , isSubsetOf + + -- * Transformations + , map + + -- * Difference and intersection + , difference + , intersection + + -- * Folds + , foldl' + , foldr + + -- * Filter + , filter + + -- * Conversions + + -- ** Lists + , toList + , fromList + + -- * HashMaps + , toMap + , fromMap + ) where + +import Data.HashSet.Internal +import Prelude () diff --git a/Data/HashSet/Internal.hs b/Data/HashSet/Internal.hs new file mode 100644 index 0000000..e12acbf --- /dev/null +++ b/Data/HashSet/Internal.hs @@ -0,0 +1,455 @@ +{-# LANGUAGE CPP #-} +{-# LANGUAGE DeriveLift #-} +{-# LANGUAGE RoleAnnotations #-} +{-# LANGUAGE StandaloneDeriving #-} +{-# LANGUAGE Trustworthy #-} +{-# LANGUAGE TypeFamilies #-} +{-# OPTIONS_HADDOCK not-home #-} + +------------------------------------------------------------------------ +-- | +-- Module : Data.HashSet.Internal +-- Copyright : 2011 Bryan O'Sullivan +-- License : BSD-style +-- Maintainer : johan.tibell@gmail.com +-- Portability : portable +-- +-- = WARNING +-- +-- This module is considered __internal__. +-- +-- The Package Versioning Policy __does not apply__. +-- +-- The contents of this module may change __in any way whatsoever__ +-- and __without any warning__ between minor versions of this package. +-- +-- Authors importing this module are expected to track development +-- closely. +-- +-- = Description +-- +-- A set of /hashable/ values. A set cannot contain duplicate items. +-- A 'HashSet' makes no guarantees as to the order of its elements. +-- +-- The implementation is based on /hash array mapped tries/. A +-- 'HashSet' is often faster than other tree-based set types, +-- especially when value comparison is expensive, as in the case of +-- strings. +-- +-- Many operations have a average-case complexity of \(O(\log n)\). The +-- implementation uses a large base (i.e. 32) so in practice these +-- operations are constant time. + +module Data.HashSet.Internal + ( + HashSet(..) + + -- * Construction + , empty + , singleton + + -- * Basic interface + , null + , size + , member + , insert + , delete + , isSubsetOf + + -- * Transformations + , map + + -- * Combine + , union + , unions + + -- * Difference and intersection + , difference + , intersection + + -- * Folds + , foldr + , foldr' + , foldl + , foldl' + + -- * Filter + , filter + + -- * Conversions + + -- ** Lists + , toList + , fromList + + -- * HashMaps + , toMap + , fromMap + + -- Exported from Data.HashMap.{Strict, Lazy} + , keysSet + ) where + +import Control.DeepSeq (NFData (..), NFData1 (..), liftRnf2) +import Data.Data (Constr, Data (..), DataType) +import Data.Functor.Classes +import Data.Hashable (Hashable (hashWithSalt)) +import Data.Hashable.Lifted (Hashable1 (..), Hashable2 (..)) +import Data.HashMap.Internal (HashMap, equalKeys, equalKeys1, foldMapWithKey, + foldlWithKey, foldrWithKey) +import Data.Semigroup (Semigroup (..), stimesIdempotentMonoid) +import Prelude hiding (filter, foldl, foldr, map, null) +import Text.Read + +import qualified Data.Data as Data +import qualified Data.Foldable as Foldable +import qualified Data.HashMap.Internal as H +import qualified Data.List as List +import qualified GHC.Exts as Exts +import qualified Language.Haskell.TH.Syntax as TH + +-- | A set of values. A set cannot contain duplicate values. +newtype HashSet a = HashSet { + asMap :: HashMap a () + } + +type role HashSet nominal + +-- | @since 0.2.17.0 +deriving instance TH.Lift a => TH.Lift (HashSet a) + +instance (NFData a) => NFData (HashSet a) where + rnf = rnf . asMap + {-# INLINE rnf #-} + +-- | @since 0.2.14.0 +instance NFData1 HashSet where + liftRnf rnf1 = liftRnf2 rnf1 rnf . asMap + +-- | Note that, in the presence of hash collisions, equal @HashSet@s may +-- behave differently, i.e. substitutivity may be violated: +-- +-- >>> data D = A | B deriving (Eq, Show) +-- >>> instance Hashable D where hashWithSalt salt _d = salt +-- +-- >>> x = fromList [A, B] +-- >>> y = fromList [B, A] +-- +-- >>> x == y +-- True +-- >>> toList x +-- [A,B] +-- >>> toList y +-- [B,A] +-- +-- In general, the lack of substitutivity can be observed with any function +-- that depends on the key ordering, such as folds and traversals. +instance (Eq a) => Eq (HashSet a) where + HashSet a == HashSet b = equalKeys a b + {-# INLINE (==) #-} + +instance Eq1 HashSet where + liftEq eq (HashSet a) (HashSet b) = equalKeys1 eq a b + +instance (Ord a) => Ord (HashSet a) where + compare (HashSet a) (HashSet b) = compare a b + {-# INLINE compare #-} + +instance Ord1 HashSet where + liftCompare c (HashSet a) (HashSet b) = liftCompare2 c compare a b + +instance Foldable.Foldable HashSet where + foldMap f = foldMapWithKey (\a _ -> f a) . asMap + foldr = foldr + {-# INLINE foldr #-} + foldl = foldl + {-# INLINE foldl #-} + foldl' = foldl' + {-# INLINE foldl' #-} + foldr' = foldr' + {-# INLINE foldr' #-} + toList = toList + {-# INLINE toList #-} + null = null + {-# INLINE null #-} + length = size + {-# INLINE length #-} + +-- | '<>' = 'union' +-- +-- \(O(n+m)\) +-- +-- To obtain good performance, the smaller set must be presented as +-- the first argument. +-- +-- ==== __Examples__ +-- +-- >>> fromList [1,2] <> fromList [2,3] +-- fromList [1,2,3] +instance (Hashable a, Eq a) => Semigroup (HashSet a) where + (<>) = union + {-# INLINE (<>) #-} + stimes = stimesIdempotentMonoid + {-# INLINE stimes #-} + +-- | 'mempty' = 'empty' +-- +-- 'mappend' = 'union' +-- +-- \(O(n+m)\) +-- +-- To obtain good performance, the smaller set must be presented as +-- the first argument. +-- +-- ==== __Examples__ +-- +-- >>> mappend (fromList [1,2]) (fromList [2,3]) +-- fromList [1,2,3] +instance (Hashable a, Eq a) => Monoid (HashSet a) where + mempty = empty + {-# INLINE mempty #-} + mappend = (<>) + {-# INLINE mappend #-} + +instance (Eq a, Hashable a, Read a) => Read (HashSet a) where + readPrec = parens $ prec 10 $ do + Ident "fromList" <- lexP + fromList <$> readPrec + + readListPrec = readListPrecDefault + +instance Show1 HashSet where + liftShowsPrec sp sl d m = + showsUnaryWith (liftShowsPrec sp sl) "fromList" d (toList m) + +instance (Show a) => Show (HashSet a) where + showsPrec d m = showParen (d > 10) $ + showString "fromList " . shows (toList m) + +instance (Data a, Eq a, Hashable a) => Data (HashSet a) where + gfoldl f z m = z fromList `f` toList m + toConstr _ = fromListConstr + gunfold k z c = case Data.constrIndex c of + 1 -> k (z fromList) + _ -> error "gunfold" + dataTypeOf _ = hashSetDataType + dataCast1 f = Data.gcast1 f + +instance Hashable1 HashSet where + liftHashWithSalt h s = liftHashWithSalt2 h hashWithSalt s . asMap + +instance (Hashable a) => Hashable (HashSet a) where + hashWithSalt salt = hashWithSalt salt . asMap + +fromListConstr :: Constr +fromListConstr = Data.mkConstr hashSetDataType "fromList" [] Data.Prefix + +hashSetDataType :: DataType +hashSetDataType = Data.mkDataType "Data.HashSet.Internal.HashSet" [fromListConstr] + +-- | \(O(1)\) Construct an empty set. +-- +-- >>> HashSet.empty +-- fromList [] +empty :: HashSet a +empty = HashSet H.empty + +-- | \(O(1)\) Construct a set with a single element. +-- +-- >>> HashSet.singleton 1 +-- fromList [1] +singleton :: Hashable a => a -> HashSet a +singleton a = HashSet (H.singleton a ()) +{-# INLINABLE singleton #-} + +-- | \(O(1)\) Convert to set to the equivalent 'HashMap' with @()@ values. +-- +-- >>> HashSet.toMap (HashSet.singleton 1) +-- fromList [(1,())] +toMap :: HashSet a -> HashMap a () +toMap = asMap + +-- | \(O(1)\) Convert from the equivalent 'HashMap' with @()@ values. +-- +-- >>> HashSet.fromMap (HashMap.singleton 1 ()) +-- fromList [1] +fromMap :: HashMap a () -> HashSet a +fromMap = HashSet + +-- | \(O(n)\) Produce a 'HashSet' of all the keys in the given 'HashMap'. +-- +-- >>> HashSet.keysSet (HashMap.fromList [(1, "a"), (2, "b")] +-- fromList [1,2] +-- +-- @since 0.2.10.0 +keysSet :: HashMap k a -> HashSet k +keysSet m = fromMap (() <$ m) + +-- | \(O(n \log m)\) Inclusion of sets. +-- +-- ==== __Examples__ +-- +-- >>> fromList [1,3] `isSubsetOf` fromList [1,2,3] +-- True +-- +-- >>> fromList [1,2] `isSubsetOf` fromList [1,3] +-- False +-- +-- @since 0.2.12 +isSubsetOf :: (Eq a, Hashable a) => HashSet a -> HashSet a -> Bool +isSubsetOf s1 s2 = H.isSubmapOfBy (\_ _ -> True) (asMap s1) (asMap s2) + +-- | \(O(n+m)\) Construct a set containing all elements from both sets. +-- +-- To obtain good performance, the smaller set must be presented as +-- the first argument. +-- +-- >>> union (fromList [1,2]) (fromList [2,3]) +-- fromList [1,2,3] +union :: (Eq a, Hashable a) => HashSet a -> HashSet a -> HashSet a +union s1 s2 = HashSet $ H.union (asMap s1) (asMap s2) +{-# INLINE union #-} + +-- TODO: Figure out the time complexity of 'unions'. + +-- | Construct a set containing all elements from a list of sets. +unions :: (Eq a, Hashable a) => [HashSet a] -> HashSet a +unions = List.foldl' union empty +{-# INLINE unions #-} + +-- | \(O(1)\) Return 'True' if this set is empty, 'False' otherwise. +-- +-- >>> HashSet.null HashSet.empty +-- True +-- >>> HashSet.null (HashSet.singleton 1) +-- False +null :: HashSet a -> Bool +null = H.null . asMap +{-# INLINE null #-} + +-- | \(O(n)\) Return the number of elements in this set. +-- +-- >>> HashSet.size HashSet.empty +-- 0 +-- >>> HashSet.size (HashSet.fromList [1,2,3]) +-- 3 +size :: HashSet a -> Int +size = H.size . asMap +{-# INLINE size #-} + +-- | \(O(\log n)\) Return 'True' if the given value is present in this +-- set, 'False' otherwise. +-- +-- >>> HashSet.member 1 (Hashset.fromList [1,2,3]) +-- True +-- >>> HashSet.member 1 (Hashset.fromList [4,5,6]) +-- False +member :: (Eq a, Hashable a) => a -> HashSet a -> Bool +member a s = case H.lookup a (asMap s) of + Just _ -> True + _ -> False +{-# INLINABLE member #-} + +-- | \(O(\log n)\) Add the specified value to this set. +-- +-- >>> HashSet.insert 1 HashSet.empty +-- fromList [1] +insert :: (Eq a, Hashable a) => a -> HashSet a -> HashSet a +insert a = HashSet . H.insert a () . asMap +{-# INLINABLE insert #-} + +-- | \(O(\log n)\) Remove the specified value from this set if present. +-- +-- >>> HashSet.delete 1 (HashSet.fromList [1,2,3]) +-- fromList [2,3] +-- >>> HashSet.delete 1 (HashSet.fromList [4,5,6]) +-- fromList [4,5,6] +delete :: (Eq a, Hashable a) => a -> HashSet a -> HashSet a +delete a = HashSet . H.delete a . asMap +{-# INLINABLE delete #-} + +-- | \(O(n)\) Transform this set by applying a function to every value. +-- The resulting set may be smaller than the source. +-- +-- >>> HashSet.map show (HashSet.fromList [1,2,3]) +-- HashSet.fromList ["1","2","3"] +map :: (Hashable b, Eq b) => (a -> b) -> HashSet a -> HashSet b +map f = fromList . List.map f . toList +{-# INLINE map #-} + +-- | \(O(n)\) Difference of two sets. Return elements of the first set +-- not existing in the second. +-- +-- >>> HashSet.difference (HashSet.fromList [1,2,3]) (HashSet.fromList [2,3,4]) +-- fromList [1] +difference :: (Eq a, Hashable a) => HashSet a -> HashSet a -> HashSet a +difference (HashSet a) (HashSet b) = HashSet (H.difference a b) +{-# INLINABLE difference #-} + +-- | \(O(n)\) Intersection of two sets. Return elements present in both +-- the first set and the second. +-- +-- >>> HashSet.intersection (HashSet.fromList [1,2,3]) (HashSet.fromList [2,3,4]) +-- fromList [2,3] +intersection :: (Eq a, Hashable a) => HashSet a -> HashSet a -> HashSet a +intersection (HashSet a) (HashSet b) = HashSet (H.intersection a b) +{-# INLINABLE intersection #-} + +-- | \(O(n)\) Reduce this set by applying a binary operator to all +-- elements, using the given starting value (typically the +-- left-identity of the operator). Each application of the operator +-- is evaluated before before using the result in the next +-- application. This function is strict in the starting value. +foldl' :: (a -> b -> a) -> a -> HashSet b -> a +foldl' f z0 = H.foldlWithKey' g z0 . asMap + where g z k _ = f z k +{-# INLINE foldl' #-} + +-- | \(O(n)\) Reduce this set by applying a binary operator to all +-- elements, using the given starting value (typically the +-- right-identity of the operator). Each application of the operator +-- is evaluated before before using the result in the next +-- application. This function is strict in the starting value. +foldr' :: (b -> a -> a) -> a -> HashSet b -> a +foldr' f z0 = H.foldrWithKey' g z0 . asMap + where g k _ z = f k z +{-# INLINE foldr' #-} + +-- | \(O(n)\) Reduce this set by applying a binary operator to all +-- elements, using the given starting value (typically the +-- right-identity of the operator). +foldr :: (b -> a -> a) -> a -> HashSet b -> a +foldr f z0 = foldrWithKey g z0 . asMap + where g k _ z = f k z +{-# INLINE foldr #-} + +-- | \(O(n)\) Reduce this set by applying a binary operator to all +-- elements, using the given starting value (typically the +-- left-identity of the operator). +foldl :: (a -> b -> a) -> a -> HashSet b -> a +foldl f z0 = foldlWithKey g z0 . asMap + where g z k _ = f z k +{-# INLINE foldl #-} + +-- | \(O(n)\) Filter this set by retaining only elements satisfying a +-- predicate. +filter :: (a -> Bool) -> HashSet a -> HashSet a +filter p = HashSet . H.filterWithKey q . asMap + where q k _ = p k +{-# INLINE filter #-} + +-- | \(O(n)\) Return a list of this set's elements. The list is +-- produced lazily. +toList :: HashSet a -> [a] +toList t = Exts.build (\ c z -> foldrWithKey (const . c) z (asMap t)) +{-# INLINE toList #-} + +-- | \(O(n \min(W, n))\) Construct a set from a list of elements. +fromList :: (Eq a, Hashable a) => [a] -> HashSet a +fromList = HashSet . List.foldl' (\ m k -> H.insert k () m) H.empty +{-# INLINE fromList #-} + +instance (Eq a, Hashable a) => Exts.IsList (HashSet a) where + type Item (HashSet a) = a + fromList = fromList + toList = toList diff --git a/LICENSE b/LICENSE new file mode 100644 index 0000000..5eb7e1b --- /dev/null +++ b/LICENSE @@ -0,0 +1,30 @@ +Copyright (c) 2010, Johan Tibell + +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + + * Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. + + * Redistributions in binary form must reproduce the above + copyright notice, this list of conditions and the following + disclaimer in the documentation and/or other materials provided + with the distribution. + + * Neither the name of Johan Tibell nor the names of other + contributors may be used to endorse or promote products derived + from this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT +OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. diff --git a/Setup.hs b/Setup.hs new file mode 100644 index 0000000..9a994af --- /dev/null +++ b/Setup.hs @@ -0,0 +1,2 @@ +import Distribution.Simple +main = defaultMain diff --git a/benchmarks/Benchmarks.hs b/benchmarks/Benchmarks.hs new file mode 100644 index 0000000..ae05c42 --- /dev/null +++ b/benchmarks/Benchmarks.hs @@ -0,0 +1,521 @@ +{-# LANGUAGE CPP #-} +{-# LANGUAGE DeriveAnyClass #-} +{-# LANGUAGE DeriveGeneric #-} +{-# LANGUAGE GADTs #-} +{-# LANGUAGE PackageImports #-} +{-# LANGUAGE RecordWildCards #-} + +module Main where + +import Control.DeepSeq (NFData (..)) +import Data.Bits ((.&.)) +import Data.Functor.Identity (Identity (..)) +import Data.Hashable (Hashable, hash) +import Data.List (foldl') +import Data.Maybe (fromMaybe) +import GHC.Generics (Generic) +import Prelude hiding (lookup) +import Test.Tasty.Bench (bench, bgroup, defaultMain, env, nf, whnf) + +import qualified Data.ByteString as BS +import qualified "hashmap" Data.HashMap as IHM +import qualified Data.HashMap.Strict as HM +import qualified Data.IntMap as IM +import qualified Data.Map as M +import qualified Util.ByteString as UBS +import qualified Util.Int as UI +import qualified Util.String as US + +data B where + B :: NFData a => a -> B + +instance NFData B where + rnf (B b) = rnf b + +-- TODO: This a stopgap measure to keep the benchmark work with +-- Criterion 1.0. +data Env = Env { + n :: !Int, + + elems :: ![(String, Int)], + keys :: ![String], + elemsBS :: ![(BS.ByteString, Int)], + keysBS :: ![BS.ByteString], + elemsI :: ![(Int, Int)], + keysI :: ![Int], + elemsI2 :: ![(Int, Int)], -- for union + + keys' :: ![String], + keysBS' :: ![BS.ByteString], + keysI' :: ![Int], + + keysDup :: ![String], + keysDupBS :: ![BS.ByteString], + keysDupI :: ![Int], + elemsDup :: ![(String, Int)], + elemsDupBS :: ![(BS.ByteString, Int)], + elemsDupI :: ![(Int, Int)], + + hm :: !(HM.HashMap String Int), + hmSubset :: !(HM.HashMap String Int), + hmbs :: !(HM.HashMap BS.ByteString Int), + hmbsSubset :: !(HM.HashMap BS.ByteString Int), + hmi :: !(HM.HashMap Int Int), + hmiSubset :: !(HM.HashMap Int Int), + hmi2 :: !(HM.HashMap Int Int), + m :: !(M.Map String Int), + mSubset :: !(M.Map String Int), + mbs :: !(M.Map BS.ByteString Int), + mbsSubset :: !(M.Map BS.ByteString Int), + im :: !(IM.IntMap Int), + imSubset :: !(IM.IntMap Int), + ihm :: !(IHM.Map String Int), + ihmSubset :: !(IHM.Map String Int), + ihmbs :: !(IHM.Map BS.ByteString Int), + ihmbsSubset :: !(IHM.Map BS.ByteString Int) + } deriving (Generic, NFData) + +setupEnv :: IO Env +setupEnv = do + let n = 2^(12 :: Int) + + elems = zip keys [1..n] + keys = US.rnd 8 n + elemsBS = zip keysBS [1..n] + keysBS = UBS.rnd 8 n + elemsI = zip keysI [1..n] + keysI = UI.rnd (n+n) n + elemsI2 = zip [n `div` 2..n + (n `div` 2)] [1..n] -- for union + + keys' = US.rnd' 8 n + keysBS' = UBS.rnd' 8 n + keysI' = UI.rnd' (n+n) n + + keysDup = US.rnd 2 n + keysDupBS = UBS.rnd 2 n + keysDupI = UI.rnd (n`div`4) n + elemsDup = zip keysDup [1..n] + elemsDupBS = zip keysDupBS [1..n] + elemsDupI = zip keysDupI [1..n] + + hm = HM.fromList elems + hmSubset = HM.fromList (takeSubset n elems) + hmbs = HM.fromList elemsBS + hmbsSubset = HM.fromList (takeSubset n elemsBS) + hmi = HM.fromList elemsI + hmiSubset = HM.fromList (takeSubset n elemsI) + hmi2 = HM.fromList elemsI2 + m = M.fromList elems + mSubset = M.fromList (takeSubset n elems) + mbs = M.fromList elemsBS + mbsSubset = M.fromList (takeSubset n elemsBS) + im = IM.fromList elemsI + imSubset = IM.fromList (takeSubset n elemsI) + ihm = IHM.fromList elems + ihmSubset = IHM.fromList (takeSubset n elems) + ihmbs = IHM.fromList elemsBS + ihmbsSubset = IHM.fromList (takeSubset n elemsBS) + return Env{..} + where + takeSubset n elements = + -- use 50% of the elements for a subset check. + let subsetSize = round (fromIntegral n * 0.5 :: Double) :: Int + in take subsetSize elements + +main :: IO () +main = do + defaultMain + [ +#ifdef BENCH_containers_Map + env setupEnv $ \ ~(Env{..}) -> + -- * Comparison to other data structures + -- ** Map + bgroup "Map" + [ bgroup "lookup" + [ bench "String" $ whnf (lookupM keys) m + , bench "ByteString" $ whnf (lookupM keysBS) mbs + ] + , bgroup "lookup-miss" + [ bench "String" $ whnf (lookupM keys') m + , bench "ByteString" $ whnf (lookupM keysBS') mbs + ] + , bgroup "insert" + [ bench "String" $ whnf (insertM elems) M.empty + , bench "ByteStringString" $ whnf (insertM elemsBS) M.empty + ] + , bgroup "insert-dup" + [ bench "String" $ whnf (insertM elems) m + , bench "ByteStringString" $ whnf (insertM elemsBS) mbs + ] + , bgroup "delete" + [ bench "String" $ whnf (deleteM keys) m + , bench "ByteString" $ whnf (deleteM keysBS) mbs + ] + , bgroup "delete-miss" + [ bench "String" $ whnf (deleteM keys') m + , bench "ByteString" $ whnf (deleteM keysBS') mbs + ] + , bgroup "size" + [ bench "String" $ whnf M.size m + , bench "ByteString" $ whnf M.size mbs + ] + , bgroup "fromList" + [ bench "String" $ whnf M.fromList elems + , bench "ByteString" $ whnf M.fromList elemsBS + ] + , bgroup "isSubmapOf" + [ bench "String" $ whnf (M.isSubmapOf mSubset) m + , bench "ByteString" $ whnf (M.isSubmapOf mbsSubset) mbs + ] + ], +#endif + +#ifdef BENCH_hashmap_Map + -- ** Map from the hashmap package + env setupEnv $ \ ~(Env{..}) -> + bgroup "hashmap/Map" + [ bgroup "lookup" + [ bench "String" $ whnf (lookupIHM keys) ihm + , bench "ByteString" $ whnf (lookupIHM keysBS) ihmbs + ] + , bgroup "lookup-miss" + [ bench "String" $ whnf (lookupIHM keys') ihm + , bench "ByteString" $ whnf (lookupIHM keysBS') ihmbs + ] + , bgroup "insert" + [ bench "String" $ whnf (insertIHM elems) IHM.empty + , bench "ByteStringString" $ whnf (insertIHM elemsBS) IHM.empty + ] + , bgroup "insert-dup" + [ bench "String" $ whnf (insertIHM elems) ihm + , bench "ByteStringString" $ whnf (insertIHM elemsBS) ihmbs + ] + , bgroup "delete" + [ bench "String" $ whnf (deleteIHM keys) ihm + , bench "ByteString" $ whnf (deleteIHM keysBS) ihmbs + ] + , bgroup "delete-miss" + [ bench "String" $ whnf (deleteIHM keys') ihm + , bench "ByteString" $ whnf (deleteIHM keysBS') ihmbs + ] + , bgroup "size" + [ bench "String" $ whnf IHM.size ihm + , bench "ByteString" $ whnf IHM.size ihmbs + ] + , bgroup "fromList" + [ bench "String" $ whnf IHM.fromList elems + , bench "ByteString" $ whnf IHM.fromList elemsBS + ] + , bgroup "isSubmapOf" + [ bench "String" $ whnf (IHM.isSubmapOf ihmSubset) ihm + , bench "ByteString" $ whnf (IHM.isSubmapOf ihmbsSubset) ihmbs + ] + ], +#endif + +#ifdef BENCH_containers_IntMap + -- ** IntMap + env setupEnv $ \ ~(Env{..}) -> + bgroup "IntMap" + [ bench "lookup" $ whnf (lookupIM keysI) im + , bench "lookup-miss" $ whnf (lookupIM keysI') im + , bench "insert" $ whnf (insertIM elemsI) IM.empty + , bench "insert-dup" $ whnf (insertIM elemsI) im + , bench "delete" $ whnf (deleteIM keysI) im + , bench "delete-miss" $ whnf (deleteIM keysI') im + , bench "size" $ whnf IM.size im + , bench "fromList" $ whnf IM.fromList elemsI + , bench "isSubmapOf" $ whnf (IM.isSubmapOf imSubset) im + ], +#endif + + env setupEnv $ \ ~(Env{..}) -> + bgroup "HashMap" + [ -- * Basic interface + bgroup "lookup" + [ bench "String" $ whnf (lookup keys) hm + , bench "ByteString" $ whnf (lookup keysBS) hmbs + , bench "Int" $ whnf (lookup keysI) hmi + ] + , bgroup "lookup-miss" + [ bench "String" $ whnf (lookup keys') hm + , bench "ByteString" $ whnf (lookup keysBS') hmbs + , bench "Int" $ whnf (lookup keysI') hmi + ] + , bgroup "insert" + [ bench "String" $ whnf (insert elems) HM.empty + , bench "ByteString" $ whnf (insert elemsBS) HM.empty + , bench "Int" $ whnf (insert elemsI) HM.empty + ] + , bgroup "insert-dup" + [ bench "String" $ whnf (insert elems) hm + , bench "ByteString" $ whnf (insert elemsBS) hmbs + , bench "Int" $ whnf (insert elemsI) hmi + ] + , bgroup "delete" + [ bench "String" $ whnf (delete keys) hm + , bench "ByteString" $ whnf (delete keysBS) hmbs + , bench "Int" $ whnf (delete keysI) hmi + ] + , bgroup "delete-miss" + [ bench "String" $ whnf (delete keys') hm + , bench "ByteString" $ whnf (delete keysBS') hmbs + , bench "Int" $ whnf (delete keysI') hmi + ] + , bgroup "alterInsert" + [ bench "String" $ whnf (alterInsert elems) HM.empty + , bench "ByteString" $ whnf (alterInsert elemsBS) HM.empty + , bench "Int" $ whnf (alterInsert elemsI) HM.empty + ] + , bgroup "alterFInsert" + [ bench "String" $ whnf (alterFInsert elems) HM.empty + , bench "ByteString" $ whnf (alterFInsert elemsBS) HM.empty + , bench "Int" $ whnf (alterFInsert elemsI) HM.empty + ] + , bgroup "alterInsert-dup" + [ bench "String" $ whnf (alterInsert elems) hm + , bench "ByteString" $ whnf (alterInsert elemsBS) hmbs + , bench "Int" $ whnf (alterInsert elemsI) hmi + ] + , bgroup "alterFInsert-dup" + [ bench "String" $ whnf (alterFInsert elems) hm + , bench "ByteString" $ whnf (alterFInsert elemsBS) hmbs + , bench "Int" $ whnf (alterFInsert elemsI) hmi + ] + , bgroup "alterDelete" + [ bench "String" $ whnf (alterDelete keys) hm + , bench "ByteString" $ whnf (alterDelete keysBS) hmbs + , bench "Int" $ whnf (alterDelete keysI) hmi + ] + , bgroup "alterFDelete" + [ bench "String" $ whnf (alterFDelete keys) hm + , bench "ByteString" $ whnf (alterFDelete keysBS) hmbs + , bench "Int" $ whnf (alterFDelete keysI) hmi + ] + , bgroup "alterDelete-miss" + [ bench "String" $ whnf (alterDelete keys') hm + , bench "ByteString" $ whnf (alterDelete keysBS') hmbs + , bench "Int" $ whnf (alterDelete keysI') hmi + ] + , bgroup "alterFDelete-miss" + [ bench "String" $ whnf (alterFDelete keys') hm + , bench "ByteString" $ whnf (alterFDelete keysBS') hmbs + , bench "Int" $ whnf (alterFDelete keysI') hmi + ] + , bgroup "isSubmapOf" + [ bench "String" $ whnf (HM.isSubmapOf hmSubset) hm + , bench "ByteString" $ whnf (HM.isSubmapOf hmbsSubset) hmbs + , bench "Int" $ whnf (HM.isSubmapOf hmiSubset) hmi + ] + , bgroup "isSubmapOfNaive" + [ bench "String" $ whnf (isSubmapOfNaive hmSubset) hm + , bench "ByteString" $ whnf (isSubmapOfNaive hmbsSubset) hmbs + , bench "Int" $ whnf (isSubmapOfNaive hmiSubset) hmi + ] + + -- Combine + , bgroup "union" + [ bench "Int" $ whnf (HM.union hmi) hmi2 + , bench "ByteString" $ whnf (HM.union hmbs) hmbsSubset + ] + + , bgroup "intersection" + [ bench "Int" $ whnf (HM.intersection hmi) hmi2 + , bench "ByteString" $ whnf (HM.intersection hmbs) hmbsSubset + ] + + -- Transformations + , bench "map" $ whnf (HM.map (\ v -> v + 1)) hmi + + -- * Difference and intersection + , bench "difference" $ whnf (HM.difference hmi) hmi2 + + -- Folds + , bench "foldl'" $ whnf (HM.foldl' (+) 0) hmi + , bench "foldr" $ nf (HM.foldr (:) []) hmi + + -- Filter + , bench "filter" $ whnf (HM.filter (\ v -> v .&. 1 == 0)) hmi + , bench "filterWithKey" $ whnf (HM.filterWithKey (\ k _ -> k .&. 1 == 0)) hmi + + -- Size + , bgroup "size" + [ bench "String" $ whnf HM.size hm + , bench "ByteString" $ whnf HM.size hmbs + , bench "Int" $ whnf HM.size hmi + ] + + -- fromList + , bgroup "fromList" + [ bgroup "long" + [ bench "String" $ whnf HM.fromList elems + , bench "ByteString" $ whnf HM.fromList elemsBS + , bench "Int" $ whnf HM.fromList elemsI + ] + , bgroup "short" + [ bench "String" $ whnf HM.fromList elemsDup + , bench "ByteString" $ whnf HM.fromList elemsDupBS + , bench "Int" $ whnf HM.fromList elemsDupI + ] + ] + -- fromListWith + , bgroup "fromListWith" + [ bgroup "long" + [ bench "String" $ whnf (HM.fromListWith (+)) elems + , bench "ByteString" $ whnf (HM.fromListWith (+)) elemsBS + , bench "Int" $ whnf (HM.fromListWith (+)) elemsI + ] + , bgroup "short" + [ bench "String" $ whnf (HM.fromListWith (+)) elemsDup + , bench "ByteString" $ whnf (HM.fromListWith (+)) elemsDupBS + , bench "Int" $ whnf (HM.fromListWith (+)) elemsDupI + ] + ] + -- Hashable instance + , bgroup "hash" + [ bench "String" $ whnf hash hm + , bench "ByteString" $ whnf hash hmbs + ] + ] + ] + +------------------------------------------------------------------------ +-- * HashMap + +lookup :: (Eq k, Hashable k) => [k] -> HM.HashMap k Int -> Int +lookup xs m = foldl' (\z k -> fromMaybe z (HM.lookup k m)) 0 xs +{-# SPECIALIZE lookup :: [Int] -> HM.HashMap Int Int -> Int #-} +{-# SPECIALIZE lookup :: [String] -> HM.HashMap String Int -> Int #-} +{-# SPECIALIZE lookup :: [BS.ByteString] -> HM.HashMap BS.ByteString Int + -> Int #-} + +insert :: (Eq k, Hashable k) => [(k, Int)] -> HM.HashMap k Int + -> HM.HashMap k Int +insert xs m0 = foldl' (\m (k, v) -> HM.insert k v m) m0 xs +{-# SPECIALIZE insert :: [(Int, Int)] -> HM.HashMap Int Int + -> HM.HashMap Int Int #-} +{-# SPECIALIZE insert :: [(String, Int)] -> HM.HashMap String Int + -> HM.HashMap String Int #-} +{-# SPECIALIZE insert :: [(BS.ByteString, Int)] -> HM.HashMap BS.ByteString Int + -> HM.HashMap BS.ByteString Int #-} + +delete :: (Eq k, Hashable k) => [k] -> HM.HashMap k Int -> HM.HashMap k Int +delete xs m0 = foldl' (\m k -> HM.delete k m) m0 xs +{-# SPECIALIZE delete :: [Int] -> HM.HashMap Int Int -> HM.HashMap Int Int #-} +{-# SPECIALIZE delete :: [String] -> HM.HashMap String Int + -> HM.HashMap String Int #-} +{-# SPECIALIZE delete :: [BS.ByteString] -> HM.HashMap BS.ByteString Int + -> HM.HashMap BS.ByteString Int #-} + +alterInsert :: (Eq k, Hashable k) => [(k, Int)] -> HM.HashMap k Int + -> HM.HashMap k Int +alterInsert xs m0 = + foldl' (\m (k, v) -> HM.alter (const . Just $ v) k m) m0 xs +{-# SPECIALIZE alterInsert :: [(Int, Int)] -> HM.HashMap Int Int + -> HM.HashMap Int Int #-} +{-# SPECIALIZE alterInsert :: [(String, Int)] -> HM.HashMap String Int + -> HM.HashMap String Int #-} +{-# SPECIALIZE alterInsert :: [(BS.ByteString, Int)] -> HM.HashMap BS.ByteString Int + -> HM.HashMap BS.ByteString Int #-} + +alterDelete :: (Eq k, Hashable k) => [k] -> HM.HashMap k Int + -> HM.HashMap k Int +alterDelete xs m0 = + foldl' (\m k -> HM.alter (const Nothing) k m) m0 xs +{-# SPECIALIZE alterDelete :: [Int] -> HM.HashMap Int Int + -> HM.HashMap Int Int #-} +{-# SPECIALIZE alterDelete :: [String] -> HM.HashMap String Int + -> HM.HashMap String Int #-} +{-# SPECIALIZE alterDelete :: [BS.ByteString] -> HM.HashMap BS.ByteString Int + -> HM.HashMap BS.ByteString Int #-} + +alterFInsert :: (Eq k, Hashable k) => [(k, Int)] -> HM.HashMap k Int + -> HM.HashMap k Int +alterFInsert xs m0 = + foldl' (\m (k, v) -> runIdentity $ HM.alterF (const . Identity . Just $ v) k m) m0 xs +{-# SPECIALIZE alterFInsert :: [(Int, Int)] -> HM.HashMap Int Int + -> HM.HashMap Int Int #-} +{-# SPECIALIZE alterFInsert :: [(String, Int)] -> HM.HashMap String Int + -> HM.HashMap String Int #-} +{-# SPECIALIZE alterFInsert :: [(BS.ByteString, Int)] -> HM.HashMap BS.ByteString Int + -> HM.HashMap BS.ByteString Int #-} + +alterFDelete :: (Eq k, Hashable k) => [k] -> HM.HashMap k Int + -> HM.HashMap k Int +alterFDelete xs m0 = + foldl' (\m k -> runIdentity $ HM.alterF (const . Identity $ Nothing) k m) m0 xs +{-# SPECIALIZE alterFDelete :: [Int] -> HM.HashMap Int Int + -> HM.HashMap Int Int #-} +{-# SPECIALIZE alterFDelete :: [String] -> HM.HashMap String Int + -> HM.HashMap String Int #-} +{-# SPECIALIZE alterFDelete :: [BS.ByteString] -> HM.HashMap BS.ByteString Int + -> HM.HashMap BS.ByteString Int #-} + +isSubmapOfNaive :: (Eq k, Hashable k) => HM.HashMap k Int -> HM.HashMap k Int -> Bool +isSubmapOfNaive m1 m2 = and [ Just v1 == HM.lookup k1 m2 | (k1,v1) <- HM.toList m1 ] +{-# SPECIALIZE isSubmapOfNaive :: HM.HashMap Int Int -> HM.HashMap Int Int -> Bool #-} +{-# SPECIALIZE isSubmapOfNaive :: HM.HashMap String Int -> HM.HashMap String Int -> Bool #-} +{-# SPECIALIZE isSubmapOfNaive :: HM.HashMap BS.ByteString Int -> HM.HashMap BS.ByteString Int -> Bool #-} + +#ifdef BENCH_containers_Map +------------------------------------------------------------------------ +-- * Map + +lookupM :: Ord k => [k] -> M.Map k Int -> Int +lookupM xs m = foldl' (\z k -> fromMaybe z (M.lookup k m)) 0 xs +{-# SPECIALIZE lookupM :: [String] -> M.Map String Int -> Int #-} +{-# SPECIALIZE lookupM :: [BS.ByteString] -> M.Map BS.ByteString Int -> Int #-} + +insertM :: Ord k => [(k, Int)] -> M.Map k Int -> M.Map k Int +insertM xs m0 = foldl' (\m (k, v) -> M.insert k v m) m0 xs +{-# SPECIALIZE insertM :: [(String, Int)] -> M.Map String Int + -> M.Map String Int #-} +{-# SPECIALIZE insertM :: [(BS.ByteString, Int)] -> M.Map BS.ByteString Int + -> M.Map BS.ByteString Int #-} + +deleteM :: Ord k => [k] -> M.Map k Int -> M.Map k Int +deleteM xs m0 = foldl' (\m k -> M.delete k m) m0 xs +{-# SPECIALIZE deleteM :: [String] -> M.Map String Int -> M.Map String Int #-} +{-# SPECIALIZE deleteM :: [BS.ByteString] -> M.Map BS.ByteString Int + -> M.Map BS.ByteString Int #-} +#endif + +#ifdef BENCH_hashmap_Map +------------------------------------------------------------------------ +-- * Map from the hashmap package + +lookupIHM :: (Eq k, Hashable k, Ord k) => [k] -> IHM.Map k Int -> Int +lookupIHM xs m = foldl' (\z k -> fromMaybe z (IHM.lookup k m)) 0 xs +{-# SPECIALIZE lookupIHM :: [String] -> IHM.Map String Int -> Int #-} +{-# SPECIALIZE lookupIHM :: [BS.ByteString] -> IHM.Map BS.ByteString Int + -> Int #-} + +insertIHM :: (Eq k, Hashable k, Ord k) => [(k, Int)] -> IHM.Map k Int + -> IHM.Map k Int +insertIHM xs m0 = foldl' (\m (k, v) -> IHM.insert k v m) m0 xs +{-# SPECIALIZE insertIHM :: [(String, Int)] -> IHM.Map String Int + -> IHM.Map String Int #-} +{-# SPECIALIZE insertIHM :: [(BS.ByteString, Int)] -> IHM.Map BS.ByteString Int + -> IHM.Map BS.ByteString Int #-} + +deleteIHM :: (Eq k, Hashable k, Ord k) => [k] -> IHM.Map k Int -> IHM.Map k Int +deleteIHM xs m0 = foldl' (\m k -> IHM.delete k m) m0 xs +{-# SPECIALIZE deleteIHM :: [String] -> IHM.Map String Int + -> IHM.Map String Int #-} +{-# SPECIALIZE deleteIHM :: [BS.ByteString] -> IHM.Map BS.ByteString Int + -> IHM.Map BS.ByteString Int #-} +#endif + +#ifdef BENCH_containers_IntMap +------------------------------------------------------------------------ +-- * IntMap + +lookupIM :: [Int] -> IM.IntMap Int -> Int +lookupIM xs m = foldl' (\z k -> fromMaybe z (IM.lookup k m)) 0 xs + +insertIM :: [(Int, Int)] -> IM.IntMap Int -> IM.IntMap Int +insertIM xs m0 = foldl' (\m (k, v) -> IM.insert k v m) m0 xs + +deleteIM :: [Int] -> IM.IntMap Int -> IM.IntMap Int +deleteIM xs m0 = foldl' (\m k -> IM.delete k m) m0 xs +#endif diff --git a/benchmarks/Util/ByteString.hs b/benchmarks/Util/ByteString.hs new file mode 100644 index 0000000..45eb9aa --- /dev/null +++ b/benchmarks/Util/ByteString.hs @@ -0,0 +1,28 @@ +-- | Benchmarking utilities. For example, functions for generating +-- random 'ByteString's. +module Util.ByteString where + +import qualified Data.ByteString as S +import qualified Data.ByteString.Char8 as C +import qualified Util.String as String + +-- | Generate a number of fixed length 'ByteString's where the content +-- of the strings are letters in ascending order. +asc :: Int -- ^ Length of each string + -> Int -- ^ Number of strings + -> [S.ByteString] +asc strlen num = map C.pack $ String.asc strlen num + +-- | Generate a number of fixed length 'ByteString's where the content +-- of the strings are letters in random order. +rnd :: Int -- ^ Length of each string + -> Int -- ^ Number of strings + -> [S.ByteString] +rnd strlen num = map C.pack $ String.rnd strlen num + +-- | Generate a number of fixed length 'ByteString's where the content +-- of the strings are letters in random order, different from @rnd@. +rnd' :: Int -- ^ Length of each string + -> Int -- ^ Number of strings + -> [S.ByteString] +rnd' strlen num = map C.pack $ String.rnd' strlen num diff --git a/benchmarks/Util/Int.hs b/benchmarks/Util/Int.hs new file mode 100644 index 0000000..0a44a14 --- /dev/null +++ b/benchmarks/Util/Int.hs @@ -0,0 +1,19 @@ +-- | Benchmarking utilities. For example, functions for generating +-- random integers. +module Util.Int where + +import System.Random (mkStdGen, randomRs) + +-- | Generate a number of uniform random integers in the interval +-- @[0..upper]@. +rnd :: Int -- ^ Upper bound (inclusive) + -> Int -- ^ Number of integers + -> [Int] +rnd upper num = take num $ randomRs (0, upper) $ mkStdGen 1234 + +-- | Generate a number of uniform random integers in the interval +-- @[0..upper]@ different from @rnd@. +rnd' :: Int -- ^ Upper bound (inclusive) + -> Int -- ^ Number of integers + -> [Int] +rnd' upper num = take num $ randomRs (0, upper) $ mkStdGen 5678 diff --git a/benchmarks/Util/String.hs b/benchmarks/Util/String.hs new file mode 100644 index 0000000..c649adf --- /dev/null +++ b/benchmarks/Util/String.hs @@ -0,0 +1,34 @@ +-- | Benchmarking utilities. For example, functions for generating +-- random strings. +module Util.String where + +import System.Random (mkStdGen, randomRs) + +-- | Generate a number of fixed length strings where the content of +-- the strings are letters in ascending order. +asc :: Int -- ^ Length of each string + -> Int -- ^ Number of strings + -> [String] +asc strlen num = take num $ iterate (snd . inc) $ replicate strlen 'a' + where inc [] = (True, []) + inc (c:cs) = case inc cs of (True, cs') | c == 'z' -> (True, 'a' : cs') + | otherwise -> (False, succ c : cs') + (False, cs') -> (False, c : cs') + +-- | Generate a number of fixed length strings where the content of +-- the strings are letters in random order. +rnd :: Int -- ^ Length of each string + -> Int -- ^ Number of strings + -> [String] +rnd strlen num = take num $ split $ randomRs ('a', 'z') $ mkStdGen 1234 + where + split cs = case splitAt strlen cs of (str, cs') -> str : split cs' + +-- | Generate a number of fixed length strings where the content of +-- the strings are letters in random order, different from rnd +rnd' :: Int -- ^ Length of each string + -> Int -- ^ Number of strings + -> [String] +rnd' strlen num = take num $ split $ randomRs ('a', 'z') $ mkStdGen 5678 + where + split cs = case splitAt strlen cs of (str, cs') -> str : split cs' diff --git a/tests/Main.hs b/tests/Main.hs new file mode 100644 index 0000000..9e337ad --- /dev/null +++ b/tests/Main.hs @@ -0,0 +1,14 @@ +module Main (main) where + +import Test.Tasty (defaultMain, testGroup) + +import qualified Properties +import qualified Regressions +import qualified Strictness + +main :: IO () +main = defaultMain $ testGroup "All" + [ Properties.tests + , Regressions.tests + , Strictness.tests + ] diff --git a/tests/Properties.hs b/tests/Properties.hs new file mode 100644 index 0000000..01acc42 --- /dev/null +++ b/tests/Properties.hs @@ -0,0 +1,16 @@ +module Properties (tests) where + +import Test.Tasty (TestTree, testGroup) + +import qualified Properties.HashMapLazy +import qualified Properties.HashMapStrict +import qualified Properties.HashSet +import qualified Properties.List + +tests :: TestTree +tests = testGroup "Properties" + [ Properties.HashMapLazy.tests + , Properties.HashMapStrict.tests + , Properties.HashSet.tests + , Properties.List.tests + ] diff --git a/tests/Properties/HashMapLazy.hs b/tests/Properties/HashMapLazy.hs new file mode 100644 index 0000000..86f0ceb --- /dev/null +++ b/tests/Properties/HashMapLazy.hs @@ -0,0 +1,593 @@ +{-# LANGUAGE CPP #-} +{-# LANGUAGE GeneralizedNewtypeDeriving #-} +{-# OPTIONS_GHC -fno-warn-orphans #-} -- because of Arbitrary (HashMap k v) + +-- | Tests for the 'Data.HashMap.Lazy' module. We test functions by +-- comparing them to @Map@ from @containers@. + +#if defined(STRICT) +#define MODULE_NAME Properties.HashMapStrict +#else +#define MODULE_NAME Properties.HashMapLazy +#endif + +module MODULE_NAME (tests) where + +import Control.Applicative (Const (..)) +import Control.Monad (guard) +import Data.Bifoldable +import Data.Function (on) +import Data.Functor.Identity (Identity (..)) +import Data.Hashable (Hashable (hashWithSalt)) +import Data.Ord (comparing) +import Test.QuickCheck (Arbitrary (..), Property, elements, forAll, + (===), (==>)) +import Test.QuickCheck.Function (Fun, apply) +import Test.QuickCheck.Poly (A, B) +import Test.Tasty (TestTree, testGroup) +import Test.Tasty.QuickCheck (testProperty) + +import qualified Data.Foldable as Foldable +import qualified Data.List as List + +#if defined(STRICT) +import Data.HashMap.Strict (HashMap) +import qualified Data.HashMap.Strict as HM +import qualified Data.Map.Strict as M +#else +import Data.HashMap.Lazy (HashMap) +import qualified Data.HashMap.Lazy as HM +import qualified Data.Map.Lazy as M +#endif + +-- Key type that generates more hash collisions. +newtype Key = K { unK :: Int } + deriving (Arbitrary, Eq, Ord, Read, Show, Num) + +instance Hashable Key where + hashWithSalt salt k = hashWithSalt salt (unK k) `mod` 20 + +instance (Eq k, Hashable k, Arbitrary k, Arbitrary v) => Arbitrary (HashMap k v) where + arbitrary = fmap (HM.fromList) arbitrary + +------------------------------------------------------------------------ +-- * Properties + +------------------------------------------------------------------------ +-- ** Instances + +pEq :: [(Key, Int)] -> [(Key, Int)] -> Bool +pEq xs = (M.fromList xs ==) `eq` (HM.fromList xs ==) + +pNeq :: [(Key, Int)] -> [(Key, Int)] -> Bool +pNeq xs = (M.fromList xs /=) `eq` (HM.fromList xs /=) + +-- We cannot compare to `Data.Map` as ordering is different. +pOrd1 :: [(Key, Int)] -> Bool +pOrd1 xs = compare x x == EQ + where + x = HM.fromList xs + +pOrd2 :: [(Key, Int)] -> [(Key, Int)] -> [(Key, Int)] -> Bool +pOrd2 xs ys zs = case (compare x y, compare y z) of + (EQ, o) -> compare x z == o + (o, EQ) -> compare x z == o + (LT, LT) -> compare x z == LT + (GT, GT) -> compare x z == GT + (LT, GT) -> True -- ys greater than xs and zs. + (GT, LT) -> True + where + x = HM.fromList xs + y = HM.fromList ys + z = HM.fromList zs + +pOrd3 :: [(Key, Int)] -> [(Key, Int)] -> Bool +pOrd3 xs ys = case (compare x y, compare y x) of + (EQ, EQ) -> True + (LT, GT) -> True + (GT, LT) -> True + _ -> False + where + x = HM.fromList xs + y = HM.fromList ys + +pOrdEq :: [(Key, Int)] -> [(Key, Int)] -> Bool +pOrdEq xs ys = case (compare x y, x == y) of + (EQ, True) -> True + (LT, False) -> True + (GT, False) -> True + _ -> False + where + x = HM.fromList xs + y = HM.fromList ys + +pReadShow :: [(Key, Int)] -> Bool +pReadShow xs = M.fromList xs == read (show (M.fromList xs)) + +pFunctor :: [(Key, Int)] -> Bool +pFunctor = fmap (+ 1) `eq_` fmap (+ 1) + +pFoldable :: [(Int, Int)] -> Bool +pFoldable = (List.sort . Foldable.foldr (:) []) `eq` + (List.sort . Foldable.foldr (:) []) + +pHashable :: [(Key, Int)] -> [Int] -> Int -> Property +pHashable xs is salt = + x == y ==> hashWithSalt salt x === hashWithSalt salt y + where + xs' = List.nubBy (\(k,_) (k',_) -> k == k') xs + ys = shuffle is xs' + x = HM.fromList xs' + y = HM.fromList ys + -- Shuffle the list using indexes in the second + shuffle :: [Int] -> [a] -> [a] + shuffle idxs = List.map snd + . List.sortBy (comparing fst) + . List.zip (idxs ++ [List.maximum (0:is) + 1 ..]) + +------------------------------------------------------------------------ +-- ** Basic interface + +pSize :: [(Key, Int)] -> Bool +pSize = M.size `eq` HM.size + +pMember :: Key -> [(Key, Int)] -> Bool +pMember k = M.member k `eq` HM.member k + +pLookup :: Key -> [(Key, Int)] -> Bool +pLookup k = M.lookup k `eq` HM.lookup k + +pLookupOperator :: Key -> [(Key, Int)] -> Bool +pLookupOperator k = M.lookup k `eq` (HM.!? k) + +pInsert :: Key -> Int -> [(Key, Int)] -> Bool +pInsert k v = M.insert k v `eq_` HM.insert k v + +pDelete :: Key -> [(Key, Int)] -> Bool +pDelete k = M.delete k `eq_` HM.delete k + +newtype AlwaysCollide = AC Int + deriving (Arbitrary, Eq, Ord, Show) + +instance Hashable AlwaysCollide where + hashWithSalt _ _ = 1 + +-- White-box test that tests the case of deleting one of two keys from +-- a map, where the keys' hash values collide. +pDeleteCollision :: AlwaysCollide -> AlwaysCollide -> AlwaysCollide -> Int + -> Property +pDeleteCollision k1 k2 k3 idx = (k1 /= k2) && (k2 /= k3) && (k1 /= k3) ==> + HM.member toKeep $ HM.delete toDelete $ + HM.fromList [(k1, 1 :: Int), (k2, 2), (k3, 3)] + where + which = idx `mod` 3 + toDelete + | which == 0 = k1 + | which == 1 = k2 + | which == 2 = k3 + | otherwise = error "Impossible" + toKeep + | which == 0 = k2 + | which == 1 = k3 + | which == 2 = k1 + | otherwise = error "Impossible" + +pInsertWith :: Key -> [(Key, Int)] -> Bool +pInsertWith k = M.insertWith (+) k 1 `eq_` HM.insertWith (+) k 1 + +pAdjust :: Key -> [(Key, Int)] -> Bool +pAdjust k = M.adjust succ k `eq_` HM.adjust succ k + +pUpdateAdjust :: Key -> [(Key, Int)] -> Bool +pUpdateAdjust k = M.update (Just . succ) k `eq_` HM.update (Just . succ) k + +pUpdateDelete :: Key -> [(Key, Int)] -> Bool +pUpdateDelete k = M.update (const Nothing) k `eq_` HM.update (const Nothing) k + +pAlterAdjust :: Key -> [(Key, Int)] -> Bool +pAlterAdjust k = M.alter (fmap succ) k `eq_` HM.alter (fmap succ) k + +pAlterInsert :: Key -> [(Key, Int)] -> Bool +pAlterInsert k = M.alter (const $ Just 3) k `eq_` HM.alter (const $ Just 3) k + +pAlterDelete :: Key -> [(Key, Int)] -> Bool +pAlterDelete k = M.alter (const Nothing) k `eq_` HM.alter (const Nothing) k + + +-- We choose the list functor here because we don't fuss with +-- it in alterF rules and because it has a sufficiently interesting +-- structure to have a good chance of breaking if something is wrong. +pAlterF :: Key -> Fun (Maybe A) [Maybe A] -> [(Key, A)] -> Property +pAlterF k f xs = + fmap M.toAscList (M.alterF (apply f) k (M.fromList xs)) + === + fmap toAscList (HM.alterF (apply f) k (HM.fromList xs)) + +pAlterFAdjust :: Key -> [(Key, Int)] -> Bool +pAlterFAdjust k = + runIdentity . M.alterF (Identity . fmap succ) k `eq_` + runIdentity . HM.alterF (Identity . fmap succ) k + +pAlterFInsert :: Key -> [(Key, Int)] -> Bool +pAlterFInsert k = + runIdentity . M.alterF (const . Identity . Just $ 3) k `eq_` + runIdentity . HM.alterF (const . Identity . Just $ 3) k + +pAlterFInsertWith :: Key -> Fun Int Int -> [(Key, Int)] -> Bool +pAlterFInsertWith k f = + runIdentity . M.alterF (Identity . Just . maybe 3 (apply f)) k `eq_` + runIdentity . HM.alterF (Identity . Just . maybe 3 (apply f)) k + +pAlterFDelete :: Key -> [(Key, Int)] -> Bool +pAlterFDelete k = + runIdentity . M.alterF (const (Identity Nothing)) k `eq_` + runIdentity . HM.alterF (const (Identity Nothing)) k + +pAlterFLookup :: Key + -> Fun (Maybe A) B + -> [(Key, A)] -> Bool +pAlterFLookup k f = + getConst . M.alterF (Const . apply f :: Maybe A -> Const B (Maybe A)) k + `eq` + getConst . HM.alterF (Const . apply f) k + +pSubmap :: [(Key, Int)] -> [(Key, Int)] -> Bool +pSubmap xs ys = M.isSubmapOf (M.fromList xs) (M.fromList ys) == + HM.isSubmapOf (HM.fromList xs) (HM.fromList ys) + +pSubmapReflexive :: HashMap Key Int -> Bool +pSubmapReflexive m = HM.isSubmapOf m m + +pSubmapUnion :: HashMap Key Int -> HashMap Key Int -> Bool +pSubmapUnion m1 m2 = HM.isSubmapOf m1 (HM.union m1 m2) + +pNotSubmapUnion :: HashMap Key Int -> HashMap Key Int -> Property +pNotSubmapUnion m1 m2 = not (HM.isSubmapOf m1 m2) ==> HM.isSubmapOf m1 (HM.union m1 m2) + +pSubmapDifference :: HashMap Key Int -> HashMap Key Int -> Bool +pSubmapDifference m1 m2 = HM.isSubmapOf (HM.difference m1 m2) m1 + +pNotSubmapDifference :: HashMap Key Int -> HashMap Key Int -> Property +pNotSubmapDifference m1 m2 = + not (HM.null (HM.intersection m1 m2)) ==> + not (HM.isSubmapOf m1 (HM.difference m1 m2)) + +pSubmapDelete :: HashMap Key Int -> Property +pSubmapDelete m = not (HM.null m) ==> + forAll (elements (HM.keys m)) $ \k -> + HM.isSubmapOf (HM.delete k m) m + +pNotSubmapDelete :: HashMap Key Int -> Property +pNotSubmapDelete m = + not (HM.null m) ==> + forAll (elements (HM.keys m)) $ \k -> + not (HM.isSubmapOf m (HM.delete k m)) + +pSubmapInsert :: Key -> Int -> HashMap Key Int -> Property +pSubmapInsert k v m = not (HM.member k m) ==> HM.isSubmapOf m (HM.insert k v m) + +pNotSubmapInsert :: Key -> Int -> HashMap Key Int -> Property +pNotSubmapInsert k v m = not (HM.member k m) ==> not (HM.isSubmapOf (HM.insert k v m) m) + +------------------------------------------------------------------------ +-- ** Combine + +pUnion :: [(Key, Int)] -> [(Key, Int)] -> Bool +pUnion xs ys = M.union (M.fromList xs) `eq_` HM.union (HM.fromList xs) $ ys + +pUnionWith :: [(Key, Int)] -> [(Key, Int)] -> Bool +pUnionWith xs ys = M.unionWith (-) (M.fromList xs) `eq_` + HM.unionWith (-) (HM.fromList xs) $ ys + +pUnionWithKey :: [(Key, Int)] -> [(Key, Int)] -> Bool +pUnionWithKey xs ys = M.unionWithKey go (M.fromList xs) `eq_` + HM.unionWithKey go (HM.fromList xs) $ ys + where + go :: Key -> Int -> Int -> Int + go (K k) i1 i2 = k - i1 + i2 + +pUnions :: [[(Key, Int)]] -> Bool +pUnions xss = M.toAscList (M.unions (map M.fromList xss)) == + toAscList (HM.unions (map HM.fromList xss)) + +------------------------------------------------------------------------ +-- ** Transformations + +pMap :: [(Key, Int)] -> Bool +pMap = M.map (+ 1) `eq_` HM.map (+ 1) + +pTraverse :: [(Key, Int)] -> Bool +pTraverse xs = + List.sort (fmap (List.sort . M.toList) (M.traverseWithKey (\_ v -> [v + 1, v + 2]) (M.fromList (take 10 xs)))) + == List.sort (fmap (List.sort . HM.toList) (HM.traverseWithKey (\_ v -> [v + 1, v + 2]) (HM.fromList (take 10 xs)))) + +pMapKeys :: [(Int, Int)] -> Bool +pMapKeys = M.mapKeys (+1) `eq_` HM.mapKeys (+1) + +------------------------------------------------------------------------ +-- ** Difference and intersection + +pDifference :: [(Key, Int)] -> [(Key, Int)] -> Bool +pDifference xs ys = M.difference (M.fromList xs) `eq_` + HM.difference (HM.fromList xs) $ ys + +pDifferenceWith :: [(Key, Int)] -> [(Key, Int)] -> Bool +pDifferenceWith xs ys = M.differenceWith f (M.fromList xs) `eq_` + HM.differenceWith f (HM.fromList xs) $ ys + where + f x y = if x == 0 then Nothing else Just (x - y) + +pIntersection :: [(Key, Int)] -> [(Key, Int)] -> Bool +pIntersection xs ys = + M.intersection (M.fromList xs) + `eq_` HM.intersection (HM.fromList xs) + $ ys + +pIntersectionWith :: [(Key, Int)] -> [(Key, Int)] -> Bool +pIntersectionWith xs ys = M.intersectionWith (-) (M.fromList xs) `eq_` + HM.intersectionWith (-) (HM.fromList xs) $ ys + +pIntersectionWithKey :: [(Key, Int)] -> [(Key, Int)] -> Bool +pIntersectionWithKey xs ys = M.intersectionWithKey go (M.fromList xs) `eq_` + HM.intersectionWithKey go (HM.fromList xs) $ ys + where + go :: Key -> Int -> Int -> Int + go (K k) i1 i2 = k - i1 - i2 + +------------------------------------------------------------------------ +-- ** Folds + +pFoldr :: [(Int, Int)] -> Bool +pFoldr = (List.sort . M.foldr (:) []) `eq` (List.sort . HM.foldr (:) []) + +pFoldl :: [(Int, Int)] -> Bool +pFoldl = (List.sort . M.foldl (flip (:)) []) `eq` (List.sort . HM.foldl (flip (:)) []) + +pBifoldMap :: [(Int, Int)] -> Bool +pBifoldMap xs = concatMap f (HM.toList m) == bifoldMap (:[]) (:[]) m + where f (k, v) = [k, v] + m = HM.fromList xs + +pBifoldr :: [(Int, Int)] -> Bool +pBifoldr xs = concatMap f (HM.toList m) == bifoldr (:) (:) [] m + where f (k, v) = [k, v] + m = HM.fromList xs + +pBifoldl :: [(Int, Int)] -> Bool +pBifoldl xs = reverse (concatMap f $ HM.toList m) == bifoldl (flip (:)) (flip (:)) [] m + where f (k, v) = [k, v] + m = HM.fromList xs + +pFoldrWithKey :: [(Int, Int)] -> Bool +pFoldrWithKey = (sortByKey . M.foldrWithKey f []) `eq` + (sortByKey . HM.foldrWithKey f []) + where f k v z = (k, v) : z + +pFoldMapWithKey :: [(Int, Int)] -> Bool +pFoldMapWithKey = (sortByKey . M.foldMapWithKey f) `eq` + (sortByKey . HM.foldMapWithKey f) + where f k v = [(k, v)] + +pFoldrWithKey' :: [(Int, Int)] -> Bool +pFoldrWithKey' = (sortByKey . M.foldrWithKey' f []) `eq` + (sortByKey . HM.foldrWithKey' f []) + where f k v z = (k, v) : z + +pFoldlWithKey :: [(Int, Int)] -> Bool +pFoldlWithKey = (sortByKey . M.foldlWithKey f []) `eq` + (sortByKey . HM.foldlWithKey f []) + where f z k v = (k, v) : z + +pFoldlWithKey' :: [(Int, Int)] -> Bool +pFoldlWithKey' = (sortByKey . M.foldlWithKey' f []) `eq` + (sortByKey . HM.foldlWithKey' f []) + where f z k v = (k, v) : z + +pFoldl' :: [(Int, Int)] -> Bool +pFoldl' = (List.sort . M.foldl' (flip (:)) []) `eq` (List.sort . HM.foldl' (flip (:)) []) + +pFoldr' :: [(Int, Int)] -> Bool +pFoldr' = (List.sort . M.foldr' (:) []) `eq` (List.sort . HM.foldr' (:) []) + +------------------------------------------------------------------------ +-- ** Filter + +pMapMaybeWithKey :: [(Key, Int)] -> Bool +pMapMaybeWithKey = M.mapMaybeWithKey f `eq_` HM.mapMaybeWithKey f + where f k v = guard (odd (unK k + v)) >> Just (v + 1) + +pMapMaybe :: [(Key, Int)] -> Bool +pMapMaybe = M.mapMaybe f `eq_` HM.mapMaybe f + where f v = guard (odd v) >> Just (v + 1) + +pFilter :: [(Key, Int)] -> Bool +pFilter = M.filter odd `eq_` HM.filter odd + +pFilterWithKey :: [(Key, Int)] -> Bool +pFilterWithKey = M.filterWithKey p `eq_` HM.filterWithKey p + where p k v = odd (unK k + v) + +------------------------------------------------------------------------ +-- ** Conversions + +-- The free magma is used to test that operations are applied in the +-- same order. +data Magma a + = Leaf a + | Op (Magma a) (Magma a) + deriving (Show, Eq, Ord) + +instance Hashable a => Hashable (Magma a) where + hashWithSalt s (Leaf a) = hashWithSalt s (hashWithSalt (1::Int) a) + hashWithSalt s (Op m n) = hashWithSalt s (hashWithSalt (hashWithSalt (2::Int) m) n) + +-- 'eq_' already calls fromList. +pFromList :: [(Key, Int)] -> Bool +pFromList = id `eq_` id + +pFromListWith :: [(Key, Int)] -> Bool +pFromListWith kvs = (M.toAscList $ M.fromListWith Op kvsM) == + (toAscList $ HM.fromListWith Op kvsM) + where kvsM = fmap (fmap Leaf) kvs + +pFromListWithKey :: [(Key, Int)] -> Bool +pFromListWithKey kvs = (M.toAscList $ M.fromListWithKey combine kvsM) == + (toAscList $ HM.fromListWithKey combine kvsM) + where kvsM = fmap (\(K k,v) -> (Leaf k, Leaf v)) kvs + combine k v1 v2 = Op k (Op v1 v2) + +pToList :: [(Key, Int)] -> Bool +pToList = M.toAscList `eq` toAscList + +pElems :: [(Key, Int)] -> Bool +pElems = (List.sort . M.elems) `eq` (List.sort . HM.elems) + +pKeys :: [(Key, Int)] -> Bool +pKeys = (List.sort . M.keys) `eq` (List.sort . HM.keys) + +------------------------------------------------------------------------ +-- * Test list + +tests :: TestTree +tests = + testGroup +#if defined(STRICT) + "Data.HashMap.Strict" +#else + "Data.HashMap.Lazy" +#endif + [ + -- Instances + testGroup "instances" + [ testProperty "==" pEq + , testProperty "/=" pNeq + , testProperty "compare reflexive" pOrd1 + , testProperty "compare transitive" pOrd2 + , testProperty "compare antisymmetric" pOrd3 + , testProperty "Ord => Eq" pOrdEq + , testProperty "Read/Show" pReadShow + , testProperty "Functor" pFunctor + , testProperty "Foldable" pFoldable + , testProperty "Hashable" pHashable + ] + -- Basic interface + , testGroup "basic interface" + [ testProperty "size" pSize + , testProperty "member" pMember + , testProperty "lookup" pLookup + , testProperty "!?" pLookupOperator + , testProperty "insert" pInsert + , testProperty "delete" pDelete + , testProperty "deleteCollision" pDeleteCollision + , testProperty "insertWith" pInsertWith + , testProperty "adjust" pAdjust + , testProperty "updateAdjust" pUpdateAdjust + , testProperty "updateDelete" pUpdateDelete + , testProperty "alterAdjust" pAlterAdjust + , testProperty "alterInsert" pAlterInsert + , testProperty "alterDelete" pAlterDelete + , testProperty "alterF" pAlterF + , testProperty "alterFAdjust" pAlterFAdjust + , testProperty "alterFInsert" pAlterFInsert + , testProperty "alterFInsertWith" pAlterFInsertWith + , testProperty "alterFDelete" pAlterFDelete + , testProperty "alterFLookup" pAlterFLookup + , testGroup "isSubmapOf" + [ testProperty "container compatibility" pSubmap + , testProperty "m ⊆ m" pSubmapReflexive + , testProperty "m1 ⊆ m1 ∪ m2" pSubmapUnion + , testProperty "m1 ⊈ m2 ⇒ m1 ∪ m2 ⊈ m1" pNotSubmapUnion + , testProperty "m1\\m2 ⊆ m1" pSubmapDifference + , testProperty "m1 ∩ m2 ≠ ∅ ⇒ m1 ⊈ m1\\m2 " pNotSubmapDifference + , testProperty "delete k m ⊆ m" pSubmapDelete + , testProperty "m ⊈ delete k m " pNotSubmapDelete + , testProperty "k ∉ m ⇒ m ⊆ insert k v m" pSubmapInsert + , testProperty "k ∉ m ⇒ insert k v m ⊈ m" pNotSubmapInsert + ] + ] + -- Combine + , testProperty "union" pUnion + , testProperty "unionWith" pUnionWith + , testProperty "unionWithKey" pUnionWithKey + , testProperty "unions" pUnions + -- Transformations + , testProperty "map" pMap + , testProperty "traverse" pTraverse + , testProperty "mapKeys" pMapKeys + -- Folds + , testGroup "folds" + [ testProperty "foldr" pFoldr + , testProperty "foldl" pFoldl + , testProperty "bifoldMap" pBifoldMap + , testProperty "bifoldr" pBifoldr + , testProperty "bifoldl" pBifoldl + , testProperty "foldrWithKey" pFoldrWithKey + , testProperty "foldlWithKey" pFoldlWithKey + , testProperty "foldrWithKey'" pFoldrWithKey' + , testProperty "foldlWithKey'" pFoldlWithKey' + , testProperty "foldl'" pFoldl' + , testProperty "foldr'" pFoldr' + , testProperty "foldMapWithKey" pFoldMapWithKey + ] + , testGroup "difference and intersection" + [ testProperty "difference" pDifference + , testProperty "differenceWith" pDifferenceWith + , testProperty "intersection" pIntersection + , testProperty "intersectionWith" pIntersectionWith + , testProperty "intersectionWithKey" pIntersectionWithKey + ] + -- Filter + , testGroup "filter" + [ testProperty "filter" pFilter + , testProperty "filterWithKey" pFilterWithKey + , testProperty "mapMaybe" pMapMaybe + , testProperty "mapMaybeWithKey" pMapMaybeWithKey + ] + -- Conversions + , testGroup "conversions" + [ testProperty "elems" pElems + , testProperty "keys" pKeys + , testProperty "fromList" pFromList + , testProperty "fromListWith" pFromListWith + , testProperty "fromListWithKey" pFromListWithKey + , testProperty "toList" pToList + ] + ] + +------------------------------------------------------------------------ +-- * Model + +type Model k v = M.Map k v + +-- | Check that a function operating on a 'HashMap' is equivalent to +-- one operating on a 'Model'. +eq :: (Eq a, Eq k, Hashable k, Ord k) + => (Model k v -> a) -- ^ Function that modifies a 'Model' + -> (HM.HashMap k v -> a) -- ^ Function that modified a 'HashMap' in the same + -- way + -> [(k, v)] -- ^ Initial content of the 'HashMap' and 'Model' + -> Bool -- ^ True if the functions are equivalent +eq f g xs = g (HM.fromList xs) == f (M.fromList xs) + +infix 4 `eq` + +eq_ :: (Eq k, Eq v, Hashable k, Ord k) + => (Model k v -> Model k v) -- ^ Function that modifies a 'Model' + -> (HM.HashMap k v -> HM.HashMap k v) -- ^ Function that modified a + -- 'HashMap' in the same way + -> [(k, v)] -- ^ Initial content of the 'HashMap' + -- and 'Model' + -> Bool -- ^ True if the functions are + -- equivalent +eq_ f g = (M.toAscList . f) `eq` (toAscList . g) + +infix 4 `eq_` + +------------------------------------------------------------------------ +-- * Helpers + +sortByKey :: Ord k => [(k, v)] -> [(k, v)] +sortByKey = List.sortBy (compare `on` fst) + +toAscList :: Ord k => HM.HashMap k v -> [(k, v)] +toAscList = List.sortBy (compare `on` fst) . HM.toList diff --git a/tests/Properties/HashMapStrict.hs b/tests/Properties/HashMapStrict.hs new file mode 100644 index 0000000..238348d --- /dev/null +++ b/tests/Properties/HashMapStrict.hs @@ -0,0 +1,5 @@ +{-# LANGUAGE CPP #-} + +#define STRICT + +#include "HashMapLazy.hs" diff --git a/tests/Properties/HashSet.hs b/tests/Properties/HashSet.hs new file mode 100644 index 0000000..6af5d5f --- /dev/null +++ b/tests/Properties/HashSet.hs @@ -0,0 +1,236 @@ +{-# LANGUAGE CPP #-} +{-# LANGUAGE GeneralizedNewtypeDeriving #-} + +-- | Tests for the 'Data.HashSet' module. We test functions by +-- comparing them to @Set@ from @containers@. + +module Properties.HashSet (tests) where + +import Data.Hashable (Hashable (hashWithSalt)) +import Data.Ord (comparing) +import Test.QuickCheck (Arbitrary, Property, (===), (==>)) +import Test.Tasty (TestTree, testGroup) +import Test.Tasty.QuickCheck (testProperty) + +import qualified Data.Foldable as Foldable +import qualified Data.HashSet as S +import qualified Data.List as List +import qualified Data.Set as Set + +-- Key type that generates more hash collisions. +newtype Key = K { unK :: Int } + deriving (Arbitrary, Enum, Eq, Integral, Num, Ord, Read, Show, Real) + +instance Hashable Key where + hashWithSalt salt k = hashWithSalt salt (unK k) `mod` 20 + +------------------------------------------------------------------------ +-- * Properties + +------------------------------------------------------------------------ +-- ** Instances + +pEq :: [Key] -> [Key] -> Bool +pEq xs = (Set.fromList xs ==) `eq` (S.fromList xs ==) + +pNeq :: [Key] -> [Key] -> Bool +pNeq xs = (Set.fromList xs /=) `eq` (S.fromList xs /=) + +-- We cannot compare to `Data.Map` as ordering is different. +pOrd1 :: [Key] -> Bool +pOrd1 xs = compare x x == EQ + where + x = S.fromList xs + +pOrd2 :: [Key] -> [Key] -> [Key] -> Bool +pOrd2 xs ys zs = case (compare x y, compare y z) of + (EQ, o) -> compare x z == o + (o, EQ) -> compare x z == o + (LT, LT) -> compare x z == LT + (GT, GT) -> compare x z == GT + (LT, GT) -> True -- ys greater than xs and zs. + (GT, LT) -> True + where + x = S.fromList xs + y = S.fromList ys + z = S.fromList zs + +pOrd3 :: [Key] -> [Key] -> Bool +pOrd3 xs ys = case (compare x y, compare y x) of + (EQ, EQ) -> True + (LT, GT) -> True + (GT, LT) -> True + _ -> False + where + x = S.fromList xs + y = S.fromList ys + +pOrdEq :: [Key] -> [Key] -> Bool +pOrdEq xs ys = case (compare x y, x == y) of + (EQ, True) -> True + (LT, False) -> True + (GT, False) -> True + _ -> False + where + x = S.fromList xs + y = S.fromList ys + +pReadShow :: [Key] -> Bool +pReadShow xs = Set.fromList xs == read (show (Set.fromList xs)) + +pFoldable :: [Int] -> Bool +pFoldable = (List.sort . Foldable.foldr (:) []) `eq` + (List.sort . Foldable.foldr (:) []) + +pPermutationEq :: [Key] -> [Int] -> Bool +pPermutationEq xs is = S.fromList xs == S.fromList ys + where + ys = shuffle is xs + shuffle idxs = List.map snd + . List.sortBy (comparing fst) + . List.zip (idxs ++ [List.maximum (0:is) + 1 ..]) + +pHashable :: [Key] -> [Int] -> Int -> Property +pHashable xs is salt = + x == y ==> hashWithSalt salt x === hashWithSalt salt y + where + xs' = List.nub xs + ys = shuffle is xs' + x = S.fromList xs' + y = S.fromList ys + shuffle idxs = List.map snd + . List.sortBy (comparing fst) + . List.zip (idxs ++ [List.maximum (0:is) + 1 ..]) + +------------------------------------------------------------------------ +-- ** Basic interface + +pSize :: [Key] -> Bool +pSize = Set.size `eq` S.size + +pMember :: Key -> [Key] -> Bool +pMember k = Set.member k `eq` S.member k + +pInsert :: Key -> [Key] -> Bool +pInsert a = Set.insert a `eq_` S.insert a + +pDelete :: Key -> [Key] -> Bool +pDelete a = Set.delete a `eq_` S.delete a + +------------------------------------------------------------------------ +-- ** Combine + +pUnion :: [Key] -> [Key] -> Bool +pUnion xs ys = Set.union (Set.fromList xs) `eq_` + S.union (S.fromList xs) $ ys + +------------------------------------------------------------------------ +-- ** Transformations + +pMap :: [Key] -> Bool +pMap = Set.map (+ 1) `eq_` S.map (+ 1) + +------------------------------------------------------------------------ +-- ** Folds + +pFoldr :: [Int] -> Bool +pFoldr = (List.sort . foldrSet (:) []) `eq` + (List.sort . S.foldr (:) []) + +foldrSet :: (a -> b -> b) -> b -> Set.Set a -> b +foldrSet = Set.foldr + +pFoldl' :: Int -> [Int] -> Bool +pFoldl' z0 = foldl'Set (+) z0 `eq` S.foldl' (+) z0 + +foldl'Set :: (a -> b -> a) -> a -> Set.Set b -> a +foldl'Set = Set.foldl' + +------------------------------------------------------------------------ +-- ** Filter + +pFilter :: [Key] -> Bool +pFilter = Set.filter odd `eq_` S.filter odd + +------------------------------------------------------------------------ +-- ** Conversions + +pToList :: [Key] -> Bool +pToList = Set.toAscList `eq` toAscList + +------------------------------------------------------------------------ +-- * Test list + +tests :: TestTree +tests = testGroup "Data.HashSet" + [ + -- Instances + testGroup "instances" + [ testProperty "==" pEq + , testProperty "Permutation ==" pPermutationEq + , testProperty "/=" pNeq + , testProperty "compare reflexive" pOrd1 + , testProperty "compare transitive" pOrd2 + , testProperty "compare antisymmetric" pOrd3 + , testProperty "Ord => Eq" pOrdEq + , testProperty "Read/Show" pReadShow + , testProperty "Foldable" pFoldable + , testProperty "Hashable" pHashable + ] + -- Basic interface + , testGroup "basic interface" + [ testProperty "size" pSize + , testProperty "member" pMember + , testProperty "insert" pInsert + , testProperty "delete" pDelete + ] + -- Combine + , testProperty "union" pUnion + -- Transformations + , testProperty "map" pMap + -- Folds + , testGroup "folds" + [ testProperty "foldr" pFoldr + , testProperty "foldl'" pFoldl' + ] + -- Filter + , testGroup "filter" + [ testProperty "filter" pFilter + ] + -- Conversions + , testGroup "conversions" + [ testProperty "toList" pToList + ] + ] + +------------------------------------------------------------------------ +-- * Model + +-- Invariant: the list is sorted in ascending order, by key. +type Model a = Set.Set a + +-- | Check that a function operating on a 'HashMap' is equivalent to +-- one operating on a 'Model'. +eq :: (Eq a, Hashable a, Ord a, Eq b) + => (Model a -> b) -- ^ Function that modifies a 'Model' in the same + -- way + -> (S.HashSet a -> b) -- ^ Function that modified a 'HashSet' + -> [a] -- ^ Initial content of the 'HashSet' and 'Model' + -> Bool -- ^ True if the functions are equivalent +eq f g xs = g (S.fromList xs) == f (Set.fromList xs) + +eq_ :: (Eq a, Hashable a, Ord a) + => (Model a -> Model a) -- ^ Function that modifies a 'Model' + -> (S.HashSet a -> S.HashSet a) -- ^ Function that modified a + -- 'HashSet' in the same way + -> [a] -- ^ Initial content of the 'HashSet' + -- and 'Model' + -> Bool -- ^ True if the functions are + -- equivalent +eq_ f g = (Set.toAscList . f) `eq` (toAscList . g) + +------------------------------------------------------------------------ +-- * Helpers + +toAscList :: Ord a => S.HashSet a -> [a] +toAscList = List.sort . S.toList diff --git a/tests/Properties/List.hs b/tests/Properties/List.hs new file mode 100644 index 0000000..b429478 --- /dev/null +++ b/tests/Properties/List.hs @@ -0,0 +1,64 @@ +module Properties.List (tests) where + +import Data.HashMap.Internal.List +import Data.List (nub, sort, sortBy) +import Data.Ord (comparing) +import Test.QuickCheck (Property, property, (===), (==>)) +import Test.Tasty (TestTree, testGroup) +import Test.Tasty.QuickCheck (testProperty) + +tests :: TestTree +tests = testGroup "Data.HashMap.Internal.List" + [ testProperty "isPermutationBy" pIsPermutation + , testProperty "isPermutationBy of different length" pIsPermutationDiffLength + , testProperty "pUnorderedCompare" pUnorderedCompare + , testGroup "modelUnorderedCompare" + [ testProperty "reflexive" modelUnorderedCompareRefl + , testProperty "anti-symmetric" modelUnorderedCompareAntiSymm + , testProperty "transitive" modelUnorderedCompareTrans + ] + ] + +pIsPermutation :: [Char] -> [Int] -> Bool +pIsPermutation xs is = isPermutationBy (==) xs xs' + where + is' = nub is ++ [maximum (0:is) + 1 ..] + xs' = map fst . sortBy (comparing snd) $ zip xs is' + +pIsPermutationDiffLength :: [Int] -> [Int] -> Property +pIsPermutationDiffLength xs ys = + length xs /= length ys ==> isPermutationBy (==) xs ys === False + +-- | Homogenous version of 'unorderedCompare' +-- +-- *Compare smallest non-equal elements of the two lists*. +modelUnorderedCompare :: Ord a => [a] -> [a] -> Ordering +modelUnorderedCompare as bs = compare (sort as) (sort bs) + +modelUnorderedCompareRefl :: [Int] -> Property +modelUnorderedCompareRefl xs = modelUnorderedCompare xs xs === EQ + +modelUnorderedCompareAntiSymm :: [Int] -> [Int] -> Property +modelUnorderedCompareAntiSymm xs ys = case a of + EQ -> b === EQ + LT -> b === GT + GT -> b === LT + where + a = modelUnorderedCompare xs ys + b = modelUnorderedCompare ys xs + +modelUnorderedCompareTrans :: [Int] -> [Int] -> [Int] -> Property +modelUnorderedCompareTrans xs ys zs = + case (modelUnorderedCompare xs ys, modelUnorderedCompare ys zs) of + (EQ, yz) -> xz === yz + (xy, EQ) -> xz === xy + (LT, LT) -> xz === LT + (GT, GT) -> xz === GT + (LT, GT) -> property True + (GT, LT) -> property True + where + xz = modelUnorderedCompare xs zs + +pUnorderedCompare :: [Int] -> [Int] -> Property +pUnorderedCompare xs ys = + unorderedCompare compare xs ys === modelUnorderedCompare xs ys diff --git a/tests/Regressions.hs b/tests/Regressions.hs new file mode 100644 index 0000000..52af107 --- /dev/null +++ b/tests/Regressions.hs @@ -0,0 +1,295 @@ +{-# LANGUAGE CPP #-} +{-# LANGUAGE MagicHash #-} +{-# LANGUAGE ScopedTypeVariables #-} +{-# LANGUAGE TypeApplications #-} +{-# LANGUAGE UnboxedTuples #-} +module Regressions (tests) where + +import Control.Exception (evaluate) +import Control.Monad (replicateM) +import Data.Bits (shiftL) +import Data.Hashable (Hashable (..)) +import Data.List (delete) +import Data.Maybe (isJust, isNothing) +import GHC.Exts (touch#) +import GHC.IO (IO (..)) +import Numeric.Natural (Natural) +import System.Mem (performGC) +import System.Mem.Weak (deRefWeak, mkWeakPtr) +import System.Random (randomIO) +import Test.HUnit (Assertion, assert) +import Test.QuickCheck +import Test.Tasty (TestTree, testGroup) +import Test.Tasty.HUnit (testCase) +import Test.Tasty.QuickCheck (testProperty) + +import qualified Data.HashMap.Lazy as HML +import qualified Data.HashMap.Strict as HMS +import qualified Data.HashSet as HS + +#if MIN_VERSION_base(4,12,0) +-- nothunks requires base >= 4.12 +#define HAVE_NOTHUNKS +import qualified Data.Foldable as Foldable +import NoThunks.Class (noThunksInValues) +#endif + +issue32 :: Assertion +issue32 = assert $ isJust $ HMS.lookup 7 m' + where + ns = [0..16] :: [Int] + m = HMS.fromList (zip ns (repeat [])) + m' = HMS.delete 10 m + +------------------------------------------------------------------------ +-- Issue #39 + +-- First regression + +issue39 :: Assertion +issue39 = assert $ hm1 == hm2 + where + hm1 = HMS.fromList ([a, b] `zip` [1, 1 :: Int ..]) + hm2 = HMS.fromList ([b, a] `zip` [1, 1 :: Int ..]) + a = (1, -1) :: (Int, Int) + b = (-1, 1) :: (Int, Int) + +-- Second regression + +newtype Keys = Keys [Int] + deriving Show + +instance Arbitrary Keys where + arbitrary = sized $ \l -> do + pis <- replicateM (l+1) positiveInt + return (Keys $ prefixSum pis) + + shrink (Keys ls) = + let l = length ls + in if l == 1 + then [] + else [ Keys (dropAt i ls) | i <- [0..l-1] ] + +positiveInt :: Gen Int +positiveInt = (+1) . abs <$> arbitrary + +prefixSum :: [Int] -> [Int] +prefixSum = loop 0 + where + loop _ [] = [] + loop prefix (l:ls) = let n = l + prefix + in n : loop n ls + +dropAt :: Int -> [a] -> [a] +dropAt _ [] = [] +dropAt i (l:ls) | i == 0 = ls + | otherwise = l : dropAt (i-1) ls + +propEqAfterDelete :: Keys -> Bool +propEqAfterDelete (Keys keys) = + let keyMap = mapFromKeys keys + k = head keys + in HMS.delete k keyMap == mapFromKeys (delete k keys) + +mapFromKeys :: [Int] -> HMS.HashMap Int () +mapFromKeys keys = HMS.fromList (zip keys (repeat ())) + +------------------------------------------------------------------------ +-- Issue #254 + +-- Key type that always collides. +newtype KC = KC Int + deriving (Eq, Ord, Show) +instance Hashable KC where + hashWithSalt salt _ = salt + +touch :: a -> IO () +touch a = IO (\s -> (# touch# a s, () #)) + +-- We want to make sure that old values in the HashMap are evicted when new values are inserted, +-- even if they aren't evaluated. To do that, we use the WeakPtr trick described at +-- http://simonmar.github.io/posts/2018-06-20-Finding-fixing-space-leaks.html. +-- We insert a value named oldV into the HashMap, then insert over it, checking oldV is no longer reachable. +-- +-- To make the test robust, it's important that oldV isn't hoisted up to the top or shared. +-- To do that, we generate it randomly. +issue254Lazy :: Assertion +issue254Lazy = do + i :: Int <- randomIO + let oldV = error $ "Should not be evaluated: " ++ show i + weakV <- mkWeakPtr oldV Nothing -- add the ability to test whether oldV is alive + mp <- evaluate $ HML.insert (KC 1) (error "Should not be evaluated") $ HML.fromList [(KC 0, "1"), (KC 1, oldV)] + performGC + res <- deRefWeak weakV -- gives Just if oldV is still alive + touch mp -- makes sure that we didn't GC away the whole HashMap, just oldV + assert $ isNothing res + +-- Like issue254Lazy, but using strict HashMap +issue254Strict :: Assertion +issue254Strict = do + i :: Int <- randomIO + let oldV = show i + weakV <- mkWeakPtr oldV Nothing + mp <- evaluate $ HMS.insert (KC 1) "3" $ HMS.fromList [(KC 0, "1"), (KC 1, oldV)] + performGC + res <- deRefWeak weakV + touch mp + assert $ isNothing res + +------------------------------------------------------------------------ +-- Issue #379 + +#ifdef HAVE_NOTHUNKS + +issue379Union :: Assertion +issue379Union = do + let m0 = HMS.fromList [(KC 1, ()), (KC 2, ())] + let m1 = HMS.fromList [(KC 2, ()), (KC 3, ())] + let u = m0 `HMS.union` m1 + mThunkInfo <- noThunksInValues mempty (Foldable.toList u) + assert $ isNothing mThunkInfo + +issue379StrictUnionWith :: Assertion +issue379StrictUnionWith = do + let m0 = HMS.fromList [(KC 1, 10), (KC 2, 20 :: Int)] + let m1 = HMS.fromList [(KC 2, 20), (KC 3, 30)] + let u = HMS.unionWith (+) m0 m1 + mThunkInfo <- noThunksInValues mempty (Foldable.toList u) + assert $ isNothing mThunkInfo + +issue379StrictUnionWithKey :: Assertion +issue379StrictUnionWithKey = do + let m0 = HMS.fromList [(KC 1, 10), (KC 2, 20 :: Int)] + let m1 = HMS.fromList [(KC 2, 20), (KC 3, 30)] + let u = HMS.unionWithKey (\(KC i) v0 v1 -> i + v0 + v1) m0 m1 + mThunkInfo <- noThunksInValues mempty (Foldable.toList u) + assert $ isNothing mThunkInfo + +#endif + +-- Another key type that always collides. +-- +-- Note (sjakobi): The KC newtype of Int somehow can't be used to demonstrate +-- the space leak in issue379LazyUnionWith. This type does the trick. +newtype SC = SC String + deriving (Eq, Ord, Show) +instance Hashable SC where + hashWithSalt salt _ = salt + +issue379LazyUnionWith :: Assertion +issue379LazyUnionWith = do + i :: Int <- randomIO + let k = SC (show i) + weakK <- mkWeakPtr k Nothing -- add the ability to test whether k is alive + let f :: Int -> Int + f x = error ("Should not be evaluated " ++ show x) + let m = HML.fromList [(SC "1", f 1), (SC "2", f 2), (k, f 3)] + let u = HML.unionWith (+) m m + Just v <- evaluate $ HML.lookup k u + performGC + res <- deRefWeak weakK -- gives Just if k is still alive + touch v -- makes sure that we didn't GC away the combined value + assert $ isNothing res + +------------------------------------------------------------------------ +-- Issue #381 + +#ifdef HAVE_NOTHUNKS + +issue381mapMaybe :: Assertion +issue381mapMaybe = do + let m0 = HMS.fromList [(KC 1, 10), (KC 2, 20 :: Int)] + let m1 = HMS.mapMaybe (Just . (+ 1)) m0 + mThunkInfo <- noThunksInValues mempty (Foldable.toList m1) + assert $ isNothing mThunkInfo + +issue381mapMaybeWithKey :: Assertion +issue381mapMaybeWithKey = do + let m0 = HMS.fromList [(KC 1, 10), (KC 2, 20 :: Int)] + let m1 = HMS.mapMaybeWithKey (\(KC k) v -> Just (k + v)) m0 + mThunkInfo <- noThunksInValues mempty (Foldable.toList m1) + assert $ isNothing mThunkInfo + +#endif + +------------------------------------------------------------------------ +-- Issue #382 + +issue382 :: Assertion +issue382 = do + i :: Int <- randomIO + let k = SC (show i) + weakK <- mkWeakPtr k Nothing -- add the ability to test whether k is alive + let f :: Int -> Int -> Int + f x = error ("Should not be evaluated " ++ show x) + let m = HML.fromListWith f [(k, 1), (k, 2)] + Just v <- evaluate $ HML.lookup k m + performGC + res <- deRefWeak weakK -- gives Just if k is still alive + touch v -- makes sure that we didn't GC away the combined value + assert $ isNothing res + +------------------------------------------------------------------------ +-- Issue #383 + +#ifdef HAVE_NOTHUNKS + +-- Custom Functor to prevent interference from alterF rules +newtype MyIdentity a = MyIdentity a +instance Functor MyIdentity where + fmap f (MyIdentity x) = MyIdentity (f x) + +issue383 :: Assertion +issue383 = do + i :: Int <- randomIO + let f Nothing = MyIdentity (Just (fromIntegral @Int @Natural (abs i))) + f Just{} = MyIdentity (error "Impossible") + let (MyIdentity m) = HMS.alterF f () mempty + mThunkInfo <- noThunksInValues mempty (Foldable.toList m) + assert $ isNothing mThunkInfo + +#endif + +------------------------------------------------------------------------ +-- Issue #420 + +issue420 :: Assertion +issue420 = do + let k1 :: Int = 1 `shiftL` 10 + let k2 :: Int = 2 `shiftL` 10 + let s0 = HS.fromList [k1, k2] + let s1 = s0 `HS.intersection` s0 + assert $ k1 `HS.member` s1 + assert $ k2 `HS.member` s1 + +------------------------------------------------------------------------ +-- * Test list + +tests :: TestTree +tests = testGroup "Regression tests" + [ + testCase "issue32" issue32 + , testCase "issue39a" issue39 + , testProperty "issue39b" propEqAfterDelete + , testCase "issue254 lazy" issue254Lazy + , testCase "issue254 strict" issue254Strict + , testGroup "issue379" + [ testCase "Lazy.unionWith" issue379LazyUnionWith +#ifdef HAVE_NOTHUNKS + , testCase "union" issue379Union + , testCase "Strict.unionWith" issue379StrictUnionWith + , testCase "Strict.unionWithKey" issue379StrictUnionWithKey +#endif + ] +#ifdef HAVE_NOTHUNKS + , testGroup "issue381" + [ testCase "mapMaybe" issue381mapMaybe + , testCase "mapMaybeWithKey" issue381mapMaybeWithKey + ] +#endif + , testCase "issue382" issue382 +#ifdef HAVE_NOTHUNKS + , testCase "issue383" issue383 +#endif + , testCase "issue420" issue420 + ] diff --git a/tests/Strictness.hs b/tests/Strictness.hs new file mode 100644 index 0000000..c45bb95 --- /dev/null +++ b/tests/Strictness.hs @@ -0,0 +1,188 @@ +{-# LANGUAGE CPP #-} +{-# LANGUAGE FlexibleInstances #-} +{-# LANGUAGE GeneralizedNewtypeDeriving #-} +{-# OPTIONS_GHC -fno-warn-orphans #-} + +module Strictness (tests) where + +import Control.Arrow (second) +import Control.Monad (guard) +import Data.Foldable (foldl') +import Data.Hashable (Hashable (hashWithSalt)) +import Data.HashMap.Strict (HashMap) +import Data.Maybe (fromMaybe, isJust) +import Test.ChasingBottoms.IsBottom +import Test.QuickCheck (Arbitrary (arbitrary), Property, (.&&.), + (===)) +import Test.QuickCheck.Function +import Test.QuickCheck.Poly (A) +import Test.Tasty (TestTree, testGroup) +import Test.Tasty.QuickCheck (testProperty) + +import qualified Data.HashMap.Strict as HM + +-- Key type that generates more hash collisions. +newtype Key = K { unK :: Int } + deriving (Arbitrary, Eq, Ord, Show) + +instance Hashable Key where + hashWithSalt salt k = hashWithSalt salt (unK k) `mod` 20 + +instance (Arbitrary k, Arbitrary v, Eq k, Hashable k) => + Arbitrary (HashMap k v) where + arbitrary = HM.fromList `fmap` arbitrary + +instance Show (Int -> Int) where + show _ = "" + +instance Show (Int -> Int -> Int) where + show _ = "" + +------------------------------------------------------------------------ +-- * Properties + +------------------------------------------------------------------------ +-- ** Strict module + +pSingletonKeyStrict :: Int -> Bool +pSingletonKeyStrict v = isBottom $ HM.singleton (bottom :: Key) v + +pSingletonValueStrict :: Key -> Bool +pSingletonValueStrict k = isBottom $ HM.singleton k (bottom :: Int) + +pLookupDefaultKeyStrict :: Int -> HashMap Key Int -> Bool +pLookupDefaultKeyStrict def m = isBottom $ HM.lookupDefault def bottom m + +pFindWithDefaultKeyStrict :: Int -> HashMap Key Int -> Bool +pFindWithDefaultKeyStrict def m = isBottom $ HM.findWithDefault def bottom m + +pAdjustKeyStrict :: (Int -> Int) -> HashMap Key Int -> Bool +pAdjustKeyStrict f m = isBottom $ HM.adjust f bottom m + +pAdjustValueStrict :: Key -> HashMap Key Int -> Bool +pAdjustValueStrict k m + | k `HM.member` m = isBottom $ HM.adjust (const bottom) k m + | otherwise = case HM.keys m of + [] -> True + (k':_) -> isBottom $ HM.adjust (const bottom) k' m + +pInsertKeyStrict :: Int -> HashMap Key Int -> Bool +pInsertKeyStrict v m = isBottom $ HM.insert bottom v m + +pInsertValueStrict :: Key -> HashMap Key Int -> Bool +pInsertValueStrict k m = isBottom $ HM.insert k bottom m + +pInsertWithKeyStrict :: (Int -> Int -> Int) -> Int -> HashMap Key Int -> Bool +pInsertWithKeyStrict f v m = isBottom $ HM.insertWith f bottom v m + +pInsertWithValueStrict :: (Int -> Int -> Int) -> Key -> Int -> HashMap Key Int + -> Bool +pInsertWithValueStrict f k v m + | HM.member k m = isBottom $ HM.insertWith (const2 bottom) k v m + | otherwise = isBottom $ HM.insertWith f k bottom m + +pFromListKeyStrict :: Bool +pFromListKeyStrict = isBottom $ HM.fromList [(undefined :: Key, 1 :: Int)] + +pFromListValueStrict :: Bool +pFromListValueStrict = isBottom $ HM.fromList [(K 1, undefined)] + +pFromListWithKeyStrict :: (Int -> Int -> Int) -> Bool +pFromListWithKeyStrict f = + isBottom $ HM.fromListWith f [(undefined :: Key, 1 :: Int)] + +-- The strictness properties of 'fromListWith' are not entirely +-- trivial. +-- fromListWith f kvs is strict in the first value seen for each +-- key, but potentially lazy in the rest: the combining function +-- could be lazy in the "new" value. fromListWith must, however, +-- be strict in whatever value is actually inserted into the map. +-- Getting all these properties specified efficiently seems tricky. +-- Since it's not hard, we verify that the converted HashMap has +-- no unforced values. Rather than trying to go into detail for the +-- rest, this test compares the strictness behavior of fromListWith +-- to that of insertWith. The latter should be easier to specify +-- and (if we choose to do so) test thoroughly. +-- +-- We'll fake up a representation of things that are possibly +-- bottom by using Nothing to represent bottom. The combining +-- (partial) function is represented by a "lazy total" function +-- Maybe a -> Maybe a -> Maybe a, along with a function determining +-- whether the result should be non-bottom, Maybe a -> Maybe a -> Bool, +-- indicating how the combining function should behave if neither +-- argument, just the first argument, just the second argument, +-- or both arguments are bottom. It would be quite tempting to +-- just use Maybe A -> Maybe A -> Maybe A, but that would not +-- necessarily be continous. +pFromListWithValueResultStrict :: [(Key, Maybe A)] + -> Fun (Maybe A, Maybe A) A + -> Fun (Maybe A, Maybe A) Bool + -> Property +pFromListWithValueResultStrict lst comb_lazy calc_good_raw + = all (all isJust) recovered .&&. (recovered === recover (fmap recover fake_map)) + where + recovered :: Maybe (HashMap Key (Maybe A)) + recovered = recover (fmap recover real_map) + -- What we get out of the conversion using insertWith + fake_map = foldl' (\m (k,v) -> HM.insertWith real_comb k v m) HM.empty real_list + + -- A continuous version of calc_good_raw + calc_good Nothing Nothing = cgr Nothing Nothing + calc_good Nothing y@(Just _) = cgr Nothing Nothing || cgr Nothing y + calc_good x@(Just _) Nothing = cgr Nothing Nothing || cgr x Nothing + calc_good x y = cgr Nothing Nothing || cgr Nothing y || cgr x Nothing || cgr x y + cgr = curry $ apply calc_good_raw + + -- The Maybe A -> Maybe A -> Maybe A that we're after, representing a + -- potentially less total function than comb_lazy + comb x y = apply comb_lazy (x, y) <$ guard (calc_good x y) + + -- What we get out of the conversion using fromListWith + real_map = HM.fromListWith real_comb real_list + + -- A list that may have actual bottom values in it. + real_list = map (second (fromMaybe bottom)) lst + + -- A genuinely partial function mirroring comb + real_comb x y = fromMaybe bottom $ comb (recover x) (recover y) + + recover :: a -> Maybe a + recover a = a <$ guard (not $ isBottom a) + +------------------------------------------------------------------------ +-- * Test list + +tests :: TestTree +tests = testGroup "Strictness" + [ + -- Basic interface + testGroup "HashMap.Strict" + [ testProperty "singleton is key-strict" pSingletonKeyStrict + , testProperty "singleton is value-strict" pSingletonValueStrict + , testProperty "member is key-strict" $ keyStrict HM.member + , testProperty "lookup is key-strict" $ keyStrict HM.lookup + , testProperty "lookupDefault is key-strict" pLookupDefaultKeyStrict + , testProperty "findWithDefault is key-strict" pFindWithDefaultKeyStrict + , testProperty "! is key-strict" $ keyStrict (flip (HM.!)) + , testProperty "delete is key-strict" $ keyStrict HM.delete + , testProperty "adjust is key-strict" pAdjustKeyStrict + , testProperty "adjust is value-strict" pAdjustValueStrict + , testProperty "insert is key-strict" pInsertKeyStrict + , testProperty "insert is value-strict" pInsertValueStrict + , testProperty "insertWith is key-strict" pInsertWithKeyStrict + , testProperty "insertWith is value-strict" pInsertWithValueStrict + , testProperty "fromList is key-strict" pFromListKeyStrict + , testProperty "fromList is value-strict" pFromListValueStrict + , testProperty "fromListWith is key-strict" pFromListWithKeyStrict + , testProperty "fromListWith is value-strict" pFromListWithValueResultStrict + ] + ] + +------------------------------------------------------------------------ +-- * Utilities + +keyStrict :: (Key -> HashMap Key Int -> a) -> HashMap Key Int -> Bool +keyStrict f m = isBottom $ f bottom m + +const2 :: a -> b -> c -> a +const2 x _ _ = x diff --git a/unordered-containers.cabal b/unordered-containers.cabal new file mode 100644 index 0000000..ed265b1 --- /dev/null +++ b/unordered-containers.cabal @@ -0,0 +1,144 @@ +name: unordered-containers +version: 0.2.19.1 +synopsis: Efficient hashing-based container types +description: + Efficient hashing-based container types. The containers have been + optimized for performance critical use, both in terms of large data + quantities and high speed. + . + The declared cost of each operation is either worst-case or + amortized, but remains valid even if structures are shared. + . + /Security/ + . + This package currently provides no defenses against hash collision attacks + such as HashDoS. + Users who need to store input from untrusted sources are advised to use + @Data.Map@ or @Data.Set@ from the @containers@ package instead. +license: BSD3 +license-file: LICENSE +author: Johan Tibell +maintainer: simon.jakobi@gmail.com, David.Feuer@gmail.com +Homepage: https://github.com/haskell-unordered-containers/unordered-containers +bug-reports: https://github.com/haskell-unordered-containers/unordered-containers/issues +copyright: 2010-2014 Johan Tibell + 2010 Edward Z. Yang +category: Data +build-type: Simple +cabal-version: >=1.10 +extra-source-files: CHANGES.md + +tested-with: + GHC ==9.2.1 + || ==9.0.2 + || ==8.10.7 + || ==8.8.4 + || ==8.6.5 + || ==8.4.4 + || ==8.2.2 + +flag debug + description: Enable debug support + default: False + +library + exposed-modules: + Data.HashMap.Internal + Data.HashMap.Internal.Array + Data.HashMap.Internal.List + Data.HashMap.Internal.Strict + Data.HashMap.Lazy + Data.HashMap.Strict + Data.HashSet + Data.HashSet.Internal + + build-depends: + base >= 4.10 && < 5, + deepseq >= 1.4.3, + hashable >= 1.2.5 && < 1.5, + template-haskell < 2.19 + + default-language: Haskell2010 + + other-extensions: + RoleAnnotations, + UnboxedTuples, + ScopedTypeVariables, + MagicHash, + BangPatterns + + ghc-options: -Wall -O2 -fwarn-tabs -ferror-spans + + -- For dumping the generated code: + -- ghc-options: -ddump-simpl -ddump-stg-final -ddump-cmm -ddump-asm -ddump-to-file + -- ghc-options: -dsuppress-coercions -dsuppress-unfoldings -dsuppress-module-prefixes + -- ghc-options: -dsuppress-uniques -dsuppress-timestamps + + if flag(debug) + cpp-options: -DASSERTS + +test-suite unordered-containers-tests + hs-source-dirs: tests + main-is: Main.hs + type: exitcode-stdio-1.0 + other-modules: + Regressions + Properties + Properties.HashMapLazy + Properties.HashMapStrict + Properties.HashSet + Properties.List + Strictness + + build-depends: + base, + ChasingBottoms, + containers >= 0.5.8, + hashable, + HUnit, + QuickCheck >= 2.4.0.1, + random, + tasty >= 1.4.0.3, + tasty-hunit >= 0.10.0.3, + tasty-quickcheck >= 0.10.1.2, + unordered-containers + + if impl(ghc >= 8.6) + build-depends: + nothunks >= 0.1.3 + + default-language: Haskell2010 + ghc-options: -Wall + cpp-options: -DASSERTS + +benchmark benchmarks + hs-source-dirs: benchmarks + main-is: Benchmarks.hs + type: exitcode-stdio-1.0 + + other-modules: + Util.ByteString + Util.String + Util.Int + + build-depends: + base >= 4.8.0, + bytestring >= 0.10.0.0, + containers, + deepseq, + hashable, + hashmap, + mtl, + random, + tasty-bench >= 0.3.1, + unordered-containers + + default-language: Haskell2010 + ghc-options: -Wall -O2 -rtsopts -with-rtsopts=-A32m + if impl(ghc >= 8.10) + ghc-options: "-with-rtsopts=-A32m --nonmoving-gc" + -- cpp-options: -DBENCH_containers_Map -DBENCH_containers_IntMap -DBENCH_hashmap_Map + +source-repository head + type: git + location: https://github.com/haskell-unordered-containers/unordered-containers.git