v0.9.0

github-actions released this 24 Apr 01:23 · 18 commits to main since this release

What's Changed

  • chore: refresh uv.lock to match project 0.8.0
  • perf(prim): add dense symmetric core-distance pass
  • perf(prim): replace dense MRD matrix with streaming Prim
  • perf(hdbscan): widen Prim byte budget to 512 MiB and cap dispatch at boruvkaDimCeil
  • perf(boruvka): hoist scratch, prune self-component subtrees, kNN-seed bound
  • perf(nn-descent): CSR-pack and parallelise JoinStep prelude
  • perf(hdbscan): vectorize and parallelize nn-descent disconnect fallback
  • feat(hdbscan): expose backend parameter on the Python binding
  • perf(hdbscan): dispatch Prim by byte budget and Boruvka by low-dim ceiling
  • perf(hdbscan): vectorize pointAabbGapSq via shared math kernel
  • chore(uv): sync lockfile with current project version
  • test(pybench): raise dbscan ari gate to 0.98
  • test(bench): add nn_descent build benchmark
  • test(pybench): scale vMF kappa with dim in the hdbscan recipe
  • fix(hdbscan): size nn-descent kExtra to Dong 2011's recall target
  • perf(nn_descent): drop chunk-wide cand buffer, shard bank mutex by target
  • perf(nn_descent): gate heap-push dedup scan on admission eligibility
  • feat(hdbscan): add min_samples convention flag for sklearn parity
  • bump: 0.7.8 -> 0.8.0
  • fix(hdbscan): exclude root cluster from EOM selection
  • docs(hdbscan): note Campello-Euclidean lambda scale and thread-safety contract
  • perf(pybench): wire hdbscan recipe against sklearn HDBSCAN
  • feat(hdbscan): expose Python binding via nanobind
  • feat(hdbscan): auto-dispatch MST backend as HDBSCAN default
  • feat(hdbscan): post-MST pipeline and end-to-end run wiring
  • feat(hdbscan): add NN-Descent approximate MST backend
  • feat(hdbscan): add KDTree-accelerated Boruvka MST backend
  • feat(hdbscan): add dense exact Prim MST backend
  • feat(index): add NnDescentIndex approximate kNN graph primitive
  • feat(index): kNN query and per-node AABB bounds on KDTree
  • feat(hdbscan): foundation types, concept, and class shell
  • bump: 0.7.7 -> 0.7.8
  • perf(dbscan): improve dbscan threading
  • bump: 0.7.6 -> 0.7.7
  • perf(dbscan): upper-triangle prune in brute-force range query
  • bump: 0.7.5 -> 0.7.6
  • perf(dbscan): reorder points + d=2 SIMD leaf scan, fit/run API
  • docs(readme): refresh benchmark image
  • bump: 0.7.4 -> 0.7.5
  • test(bench): Elkan-eligible k-means shapes
  • fix(kmeans): size seeder GEMM arena per worker
  • perf(kmeans): Elkan bounds for k > 64
  • perf(kmeans): Hamerly Lemma-1 shortcut
  • refactor(kmeans): drop dead seedHamerlyBounds
  • perf(kmeans): work-gated pool dispatch + lazy spawn
  • perf(kmeans): split SoA kmpp FMA chain
  • test(bench): pin k-means regression shapes
  • bump: 0.7.3 -> 0.7.4
  • fix(pybench): align datasets at the source so memray peaks are stable
  • fix(kmeans): extract nested ternaries in seeder ensureShape
  • perf(kmeans): parallel-safe SoA kmpp scoring
  • fix(kmeans): skip GEMM scoring scratch when the seeder won't use it
  • bump: 0.7.2 -> 0.7.3
  • perf(kmeans): SoA kmpp scoring, Hamerly tail re-assign, zero-copy X borrow
  • bump: 0.7.1 -> 0.7.2
  • chore: sync uv.lock project version to 0.7.0
  • chore: gitignore perf_out/
  • chore: gitignore benchmark_results*/ and untrack the _ref snapshot
  • fix(pybench): label memory axis when every facet ratio sits below 1x
  • perf(kmeans): Hamerly pruning and SIMD column reducers
  • bump: 0.7.0 -> 0.7.1
  • fix(kmeans): satisfy clang-tidy on header-scope tidy checks
  • test(pybench): consolidate redundant parameterized duplicates
  • fix(pybench): scope threadpool_limits around every theirs_fn
  • perf(kmeans): SIMD packA + cached xNorms + SIMD recomputeMinDistSq
  • perf(kmeans): unblock d>kKc and widen parallel fan-out
  • bump: 0.6.0 -> 0.7.0
  • feat: better plots
  • bump: 0.5.0 -> 0.6.0
  • fix(app): wrap main in a function-try-block for tidy exception-escape
  • perf(math): swap M-tile/panel loop nest and pre-pack A per chunk
  • perf(math): hoist pack-A scratch out of the tile loop and drop zero-init
  • perf(math): break AVX2 threshold kernel FMA chains and vectorize the emit walk
  • feat(dbscan): exact high-dimensional path via fused pairwise kernel
  • bump: 0.4.0 -> 0.5.0
  • fix(kmeans): drop noexcept on AutoSeeder::ensureShape
  • refactor: use plain std algorithms instead of std::ranges
  • refactor(kmeans): drop AfkMc2Seeder's internal greedy fallback
  • feat(kmeans): extract Lloyd and seeders as concept-constrained policies
  • feat(dbscan): constrain QueryModel on a RangeQuery concept
  • bump: 0.3.1 -> 0.4.0
  • fix(recipes): keep KMeans on isotropic blobs at every dim
  • chore: relax kmeans grid
  • fix(pybench): scope sklearn threadpool limit to every pool, not only BLAS
  • feat(pybench): vMF fixtures above 16D, knee-selected eps, matched BLAS threads
  • feat(runner): measure peak memory via memray instead of subprocess fork
  • fix(charts): label lines with n_jobs, split meta across two lines, bump DPI
  • fix(charts): gate non-finite speedup, broaden git catch, surface legacy-JSON warning
  • feat(cli): cut over to new chart pipeline and add --replot
  • feat(charts): add gates and content-addressed chart filenames
  • feat(charts): wrap results.json with run metadata
  • feat(charts): add Figure builder with 2xD grid
  • feat(charts): add pure data layer for partition, ratio, hash
  • feat(recipes): unify default_dims across dbscan and kmeans
  • bump: 0.3.0 -> 0.3.1
  • chore: remove clangd index from tracking, add .cache/ to gitignore
  • fix(ci): push tag explicitly in release workflow
  • bump: version 0.2.0 → 0.3.0
  • ci: auto-bump versions in README via commitizen version_files
  • ci: merge build and tidy into a single clang+tidy job
  • build: add release workflow and commitizen version_files
  • docs: update README with kmeans examples and benchmark commands
  • fix: gcc add -Wno-ignored-attributes
  • feat(pybench): add kmeans recipe with n_jobs grid matching dbscan
  • refactor(pybench): replace standalone kmeans script with proper recipe
  • refactor(math): extract sqEuclideanRowPtr + horizontalSum into shared avx2_helpers.h
  • refactor(math): remove dead code from kmeans math primitives
  • fix(kmeans): translate invalid-input aborts to python exceptions + defensive assert
  • perf(kmeans): 16-wide transposed seeder kernel + corner benchmark fixtures
  • perf(kmeans): skip thread-pool construction and dispatch at nJobs == 1
  • perf(kmeans): direct argmin path at d <= 8 skips packA/packB overhead
  • perf(kmeans): cache per-(point, candidate) distances during seeder scoring
  • perf(kmeans): extend transposed seeder kernel to d == kAvx2Lanes
  • perf(kmeans): transposed (d, 8) candidate kernel for low-d seeder scoring
  • perf(kmeans): binary-search inverse CDF in greedy kmpp candidate sampling
  • perf(kmeans): switch greedy kmpp local-trials count to sklearn's 2 + ln(k)
  • perf(math): widen pairwiseArgminMaxD to 64 to keep d=32 on the fused path
  • perf(kmeans): drop in-loop minDistSq recompute, keep only the final pass
  • fix(kmeans): recompute minDistSq directly to avoid f32 cancellation
  • build(kmeans): add C++ benchmark, Python binding, and pybench harness
  • perf(kmeans): chunk materialized argmin + batch greedy-kmpp scoring
  • feat(kmeans): AFK-MC2 seeder with k<100 fallback to greedy kmpp
  • feat(kmeans): KMeans class with fused Lloyd + greedy k-means++
  • feat(math): fused argmin-GEMM, accumulateByLabel, centroidShift + NDArray arithmetic allowlist
  • feat(math): scalar cosine and manhattan via pointwiseSq + ADL extensibility test
  • refactor(kdtree): delegate distance computation to math::distance CPO
  • feat(math): AVX2 SqEuclidean specialization behind pointwiseSq
  • feat(math): add pointwiseSq CPO with scalar SqEuclidean default
  • refactor(math): promote pairwise GEMM threshold to defaults::pairwiseGemmThreshold
  • test(math): add k-means assign integration smoke for pairwise dispatch
  • feat(math): dispatch pairwiseSqEuclidean on nmd threshold
  • feat(math): add detail::pairwiseSqEuclideanGemm GEMM-identity path + rowNormsSq
  • feat(math): add pairwise.h with pairwiseSqEuclidean SIMD-per-pair path
  • style: docs cleanup
  • feat(bench): pin GEMM benchmark to CCD0 and add Square_1023
  • feat(math): add heap.h with BinaryHeap and IndexedHeap + decreaseKey
  • feat(math): add dsu.h with UnionFind, iterative compression, union-by-rank
  • feat(math): add reduce.h with sum/Kahan/Welford/argmin/argmax/topk
  • feat(math): add rng.h with pcg64, xoshiro256ss, weighted sampling primitives
  • feat(math): add gemm benchmark harness with optional OpenBLAS gate
  • feat(math): parallelize GEMM Mc-tile loop with per-worker arenas
  • feat(math): add public gemm entry with Backend dispatch and mutability assert
  • feat(math): add GemmPlan with pool-owned arena and pre-packed B
  • feat(math): add AVX2 f32 8x6 GEMM microkernel
  • feat(math): add scalar reference GEMM with Goto-style outer loop
  • feat(math): add Pool wrapper for thread-pool injection
  • build: update thread pool dep
  • refactor: fix clang-tidy
  • refactor: move all public types under clustering:: namespace
  • fix(ndarray): close borrow contract gaps and tighten sameStorage
  • feat(math): arrayEqual and allClose free functions
  • feat(python): nanobind zero-copy adapter for NDArray
  • feat(math): MatrixDesc descriptor + describeMatrix extractors
  • feat(ndarray): borrow factories, alignedData, write-asserts
  • feat(ndarray): view/reshape/contiguous/clone verbs
  • feat(ndarray): borrowed view verbs t/row/col/slice/permute + sameStorage
  • feat(ndarray): add Layout template tag, variadic operator()
  • feat(ndarray): extend storage model with shape, strides, aligned allocator
  • feat: add GoogleTest infra and ctest integration
  • build: scope clang-tidy to own targets only, not CPM deps
  • fix(ci): use clang++-18 driver so C++ link pulls in libstdc++
  • build: add commitizen and bump pre-commit hook versions
  • Add C++ tooling: clang-format, clang-tidy, pre-commit, CI
  • Replace hopscotch_set with epoch bitmap; auto-enable AVX2 (v0.3.0)
  • Update README: tag v0.2.0, new benchmark commands
  • Optimize KDTree and DBSCAN performance (v0.2.0)
  • Add Python bindings and benchmark framework
  • typo
  • fix docs
  • typo
  • Init

Full diff: v0.8.0...v0.9.0
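
The entry "perf(prim): replace dense MRD matrix with streaming Prim" names a general technique that can be illustrated in a few lines. The sketch below is not this library's code or API. It is a minimal pure-Python illustration, assuming Euclidean distance and a core distance defined as the distance to the k-th nearest neighbour excluding the point itself (conventions differ on whether the point counts itself). The function names `core_distances` and `streaming_prim_mst` are hypothetical. Each mutual reachability distance (MRD) is computed on demand inside Prim's loop, so no n x n matrix is stored.

```python
import math


def core_distances(points, k):
    """Distance from each point to its k-th nearest neighbour.

    Assumption: the point itself is excluded, so index 0 of the sorted
    distances (the zero self-distance) is skipped. Requires k < len(points).
    """
    core = []
    for p in points:
        dists = sorted(math.dist(p, q) for q in points)
        core.append(dists[k])
    return core


def streaming_prim_mst(points, k):
    """Prim's MST over mutual reachability distance, without an n x n matrix.

    MRD(a, b) = max(dist(a, b), core[a], core[b]) is evaluated only for the
    vertex most recently added to the tree, keeping working memory at O(n).
    Returns a list of (parent, child, weight) edges.
    """
    n = len(points)
    core = core_distances(points, k)
    in_tree = [False] * n
    best = [math.inf] * n   # cheapest known edge from the tree to each vertex
    parent = [-1] * n       # tree endpoint of that cheapest edge
    edges = []

    current = 0
    in_tree[current] = True
    for _ in range(n - 1):
        nxt = -1
        for j in range(n):
            if in_tree[j]:
                continue
            # Compute this one MRD value on the fly instead of looking it up.
            mrd = max(math.dist(points[current], points[j]), core[current], core[j])
            if mrd < best[j]:
                best[j] = mrd
                parent[j] = current
            if nxt == -1 or best[j] < best[nxt]:
                nxt = j
        edges.append((parent[nxt], nxt, best[nxt]))
        in_tree[nxt] = True
        current = nxt
    return edges


if __name__ == "__main__":
    pts = [(0.0,), (1.0,), (2.0,), (10.0,)]
    for u, v, w in streaming_prim_mst(pts, k=1):
        print(u, v, w)
```

The trade-off is the usual one: distances are recomputed rather than cached, so the work stays quadratic in the number of points while memory drops from quadratic to linear.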