Streamify: Local Analytics for Yandex Music

Personal Yandex Music analytics on your laptop.

Data flows through raw, staging, marts and dashboard layers. Metadata stays local: ingestion, DuckDB/dbt, dashboard, reports, action queues and reproducible documentation.

Genre shift
Artist gravity 3.4x
Playlist overlap 0.28
Monthly rhythm

Yandex Music Local Lineage

This catalog documents the local metadata-only data path. Streamify does not download, store, transform, or play audio.

Layer Map

| Layer | Artifact | Purpose |
| --- | --- | --- |
| Raw/Bronze | `data/raw/yamusic/tracks.jsonl` | Track metadata, album fields, artist arrays, liked flag, source and ingestion timestamp. |
| Raw/Bronze | `data/raw/yamusic/artists.jsonl` | Artist metadata discovered from tracks and account-visible liked artists. |
| Raw/Bronze | `data/raw/yamusic/albums.jsonl` | Album metadata discovered from tracks and account-visible liked albums. |
| Raw/Bronze | `data/raw/yamusic/playlists.jsonl` | Owned playlist metadata plus account-visible liked playlists and declared track counts where exposed by Yandex Music. |
| Raw/Bronze | `data/raw/yamusic/playlist_tracks.jsonl` | Playlist-track membership and positions. |
| Raw/Bronze | `data/raw/yamusic/user_library_events.jsonl` | Derived metadata events for liked tracks and playlist membership. |
| Raw/Bronze | `data/raw/yamusic/_manifest.json` | Source, generated timestamp, adapter/client metadata, diagnostics counters, output paths, row counts and JSONL checksums. It must not contain token material. |
| Silver | `stg_yamusic_manifest` | Parsed ingestion manifest with source, generated timestamp, JSON-only flag, adapter/client metadata, diagnostics counters and raw row counts. |
| Silver | `stg_yamusic_*` | Typed DuckDB reads, dedupe, null normalization and relationship-ready keys. |
| Gold | `yamusic_dim_*`, `yamusic_fact_*`, `yamusic_*_profile`, `yamusic_*_signals` | Practical marts for self-analytics and dashboard views. |
| App | `dashboard/app.py` | Streamlit interface over `data/streamify.duckdb`. |
| Report | `data/streamify_summary.md` | Static answer-first self-analytics summary exported from the same DuckDB marts. |
| Snapshot | `data/streamify_snapshot.json` | Schema-versioned JSON self-analytics snapshot for automation, CI artifacts and downstream agent workflows. |
| Recommendations | `data/recommendations/*.csv` | Spreadsheet-friendly action queues for rediscovery, playlist cleanup, standout playlists, top artists and genre shifts. |
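
The Silver layer is described as handling dedupe and null normalization. In Streamify this happens in DuckDB/dbt, so the Python below is only an illustration of the idea, not the project's implementation. The field names `track_id` and `album_title`, the "last row wins" rule and the helper names are all assumptions made for the example.

```python
import json
from typing import Any, Iterable


def normalize_nulls(row: dict[str, Any]) -> dict[str, Any]:
    """Map empty or whitespace-only strings to None; leave other values as-is."""
    return {
        key: (None if isinstance(value, str) and not value.strip() else value)
        for key, value in row.items()
    }


def dedupe_by_key(rows: Iterable[dict[str, Any]], key: str) -> list[dict[str, Any]]:
    """Keep the last row seen per key value, in first-seen key order (assumed rule)."""
    latest: dict[Any, dict[str, Any]] = {}
    for row in rows:
        latest[row[key]] = row
    return list(latest.values())


def stage_jsonl(lines: Iterable[str], key: str) -> list[dict[str, Any]]:
    """Parse JSONL lines, normalize nulls, then dedupe on the given key."""
    rows = (normalize_nulls(json.loads(line)) for line in lines if line.strip())
    return dedupe_by_key(rows, key)


# Hypothetical raw rows; real tracks.jsonl fields may differ.
sample = [
    '{"track_id": "1", "title": "A", "album_title": ""}',
    '{"track_id": "2", "title": "B", "album_title": "X"}',
    '{"track_id": "1", "title": "A (remaster)", "album_title": "  "}',
]
staged = stage_jsonl(sample, key="track_id")
```

Here `staged` holds two rows: the later copy of track `1` replaces the earlier one, and its blank `album_title` becomes `None`.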

Lineage

flowchart LR
  ingest["yamusic_ingest\nmetadata only"] --> raw_tracks["tracks.jsonl"]
  ingest --> raw_artists["artists.jsonl"]
  ingest --> raw_albums["albums.jsonl"]
  ingest --> raw_playlists["playlists.jsonl"]
  ingest --> raw_playlist_tracks["playlist_tracks.jsonl"]
  ingest --> raw_events["user_library_events.jsonl"]
  ingest --> raw_manifest["_manifest.json"]

  raw_tracks --> stg_tracks["stg_yamusic_tracks"]
  raw_artists --> stg_artists["stg_yamusic_artists"]
  raw_albums --> stg_albums["stg_yamusic_albums"]
  raw_playlists --> stg_playlists["stg_yamusic_playlists"]
  raw_playlist_tracks --> stg_playlist_tracks["stg_yamusic_playlist_tracks"]
  raw_events --> stg_events["stg_yamusic_user_library_events"]

  stg_tracks --> dim_tracks["yamusic_dim_tracks"]
  stg_artists --> dim_artists["yamusic_dim_artists"]
  stg_albums --> dim_albums["yamusic_dim_albums"]
  stg_playlists --> dim_playlists["yamusic_dim_playlists"]
  stg_playlist_tracks --> fact_playlist_tracks["yamusic_fact_playlist_tracks"]
  stg_events --> fact_events["yamusic_fact_library_events"]

  dim_tracks --> artist_affinity["yamusic_artist_affinity"]
  fact_playlist_tracks --> artist_affinity
  dim_tracks --> genre_profile["yamusic_genre_profile"]
  dim_tracks --> track_signals["yamusic_track_signals"]
  fact_playlist_tracks --> track_signals
  fact_events --> track_signals
  fact_events --> period_activity["yamusic_period_activity"]
  dim_tracks --> period_activity
  fact_events --> genre_periods["yamusic_genre_periods"]
  dim_tracks --> genre_periods
  fact_playlist_tracks --> playlist_overlap["yamusic_playlist_overlap"]
  dim_playlists --> playlist_overlap
  dim_playlists --> playlist_signals["yamusic_playlist_signals"]
  playlist_overlap --> playlist_signals

  artist_affinity --> library_profile["yamusic_library_profile"]
  genre_profile --> library_profile
  track_signals --> library_profile
  period_activity --> library_profile
  playlist_signals --> library_profile
  dim_tracks --> library_profile
  dim_playlists --> library_profile
  fact_events --> library_profile

  library_profile --> dashboard["Streamlit dashboard"]
  library_profile --> report["Markdown summary"]
  library_profile --> snapshot["JSON snapshot"]
  library_profile --> recommendations["Recommendation CSVs"]
  dim_tracks --> dashboard
  dim_tracks --> snapshot
  artist_affinity --> dashboard
  artist_affinity --> report
  artist_affinity --> snapshot
  artist_affinity --> recommendations
  period_activity --> dashboard
  period_activity --> snapshot
  genre_periods --> dashboard
  genre_periods --> report
  genre_periods --> snapshot
  genre_periods --> recommendations
  genre_profile --> dashboard
  genre_profile --> snapshot
  playlist_overlap --> dashboard
  playlist_overlap --> snapshot
  playlist_overlap --> recommendations
  playlist_signals --> dashboard
  playlist_signals --> report
  playlist_signals --> snapshot
  playlist_signals --> recommendations
  track_signals --> dashboard
  track_signals --> report
  track_signals --> snapshot
  track_signals --> recommendations
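
The flowchart can also be read as a dependency graph. The sketch below copies a small subset of its edges into a Python dict and uses the standard-library `graphlib` module (Python 3.9+) to derive a valid build order and the upstream closure of one mart. The dict layout and the `ancestors` helper are assumptions for illustration; dbt resolves the real order itself.

```python
from graphlib import TopologicalSorter

# Subset of flowchart edges: node -> set of upstream nodes it depends on.
upstream = {
    "stg_tracks": {"raw_tracks"},
    "stg_playlists": {"raw_playlists"},
    "stg_playlist_tracks": {"raw_playlist_tracks"},
    "dim_tracks": {"stg_tracks"},
    "dim_playlists": {"stg_playlists"},
    "fact_playlist_tracks": {"stg_playlist_tracks"},
    "artist_affinity": {"dim_tracks", "fact_playlist_tracks"},
    "playlist_overlap": {"fact_playlist_tracks", "dim_playlists"},
    "playlist_signals": {"dim_playlists", "playlist_overlap"},
}


def ancestors(node: str, graph: dict[str, set[str]]) -> set[str]:
    """Return every transitive upstream node of `node`."""
    seen: set[str] = set()
    stack = list(graph.get(node, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(graph.get(current, ()))
    return seen


# Every node appears after all of its upstream dependencies.
build_order = list(TopologicalSorter(upstream).static_order())
signal_inputs = ancestors("playlist_signals", upstream)
```

`signal_inputs` lists everything that must be rebuilt before `playlist_signals` is trustworthy, which is useful when a single raw file changes.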

Product Questions

| Product question | Primary model | Supporting models |
| --- | --- | --- |
| Favorite artists | `yamusic_artist_affinity` | `yamusic_dim_tracks`, `yamusic_fact_playlist_tracks` |
| Favorite tracks | `yamusic_dim_tracks` | `yamusic_track_signals` |
| Genre shifts | `yamusic_genre_periods` | `yamusic_period_activity`, `yamusic_genre_profile` |
| Repeats | `yamusic_track_signals.repeat_signal` | `yamusic_fact_library_events`, `yamusic_fact_playlist_tracks` |
| Diversity | `yamusic_genre_profile`, `yamusic_library_profile` | `yamusic_artist_affinity` |
| Active periods | `yamusic_period_activity` | `yamusic_fact_library_events` |
| Underrated tracks | `yamusic_track_signals.underrated_flag` | `yamusic_dim_tracks` |
| Underrated playlists | `yamusic_playlist_signals.underrated_playlist_flag` | `yamusic_playlist_overlap`, `yamusic_dim_playlists` |
| Data freshness | `yamusic_library_profile.stale_ingestion_flag` | `yamusic_fact_library_events` |
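
Some primary models are written as `model.column`. A minimal sketch of turning such references into ad hoc SQL strings follows; the `select_for` helper, the `LIMIT` default and the `PRIMARY_MODELS` dict are hypothetical, and only the model and column names come from the table above.

```python
def select_for(reference: str, limit: int = 10) -> str:
    """Build a SELECT for a 'model' or 'model.column' reference."""
    model, _, column = reference.partition(".")
    # Guard against arbitrary input: accept only plain yamusic_* identifiers.
    if not (model.startswith("yamusic_") and model.isidentifier()):
        raise ValueError(f"unexpected model name: {model!r}")
    if column and not column.isidentifier():
        raise ValueError(f"unexpected column name: {column!r}")
    projection = column or "*"
    return f"SELECT {projection} FROM {model} LIMIT {limit}"


# Three rows copied from the Product Questions table.
PRIMARY_MODELS = {
    "Favorite artists": "yamusic_artist_affinity",
    "Repeats": "yamusic_track_signals.repeat_signal",
    "Data freshness": "yamusic_library_profile.stale_ingestion_flag",
}

queries = {question: select_for(ref) for question, ref in PRIMARY_MODELS.items()}
```

The resulting strings could then be run against `data/streamify.duckdb`, for example through the `duckdb` Python package with a read-only connection, assuming the marts have already been built.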

Quality Gates

| Gate | Command | What it proves |
| --- | --- | --- |
| Raw contract | `make raw-contract` | Required raw JSONL fields, basic types, source values, event types, manifest row counts, JSONL sha256 checksums, ingestion diagnostics consistency, unique IDs and playlist/event referential integrity. |
| dbt build | `make dbt-build` | dbt packages resolve, then DuckDB marts compile, build and pass schema/relationship tests. |
| Local doctor | `make doctor` | Required raw files, manifest, mart tables and one-row profile exist, with DuckDB manifest source/raw counts matching the latest raw files. |
| Dashboard smoke | `make dashboard-smoke` | Streamlit starts against the local DuckDB file and returns HTTP 200. |
| Report export | `make report` | Static markdown summary and JSON snapshot can be generated from the local DuckDB marts. |
| Snapshot export | `make snapshot` | Schema-versioned JSON self-analytics snapshot can be generated independently for automation and agent workflows. |
| Recommendations export | `make recommendations` | Spreadsheet-friendly CSV queues can be generated independently for top artists, rediscovery tracks, playlist cleanup, standout playlists and genre shifts. |
| Product answers | `make product-answers-smoke` | Favorite artists/tracks, repeats, genre shifts, diversity, active periods, playlist overlap, underrated signals, source provenance and data-quality profile are queryable from marts/report/snapshot. |
| Readiness audit | `make readiness` | Current raw counts, DuckDB profile, report existence, no-audio invariant, sample-vs-real source status and stale-dbt protection are verified. |
| Compose smoke | `make compose-smoke-local` | Docker Compose local profile builds, ingests sample metadata, validates raw contract, builds marts, runs doctor, exports the report, runs readiness, and serves dashboard HTTP 200. |
| Full local gate | `make test` | Static validators, safety guards, empty-account smoke, sample acceptance, Python tests and Docker Compose smoke. |
| Real account gate | `make acceptance-real` | Real token preflight, real metadata ingestion, raw contract, dbt deps/build, doctor, report, readiness-real source enforcement and dashboard smoke. |
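
Part of the raw contract gate is comparing manifest row counts and sha256 checksums with the JSONL files on disk. The sketch below shows one way such a check could look. The manifest layout (`{"files": {name: {"rows", "sha256"}}}`) and both helper functions are assumptions; the real `_manifest.json` schema is not documented here.

```python
import hashlib
import tempfile
from pathlib import Path


def describe_jsonl(path: Path) -> dict:
    """Count non-empty lines and hash the exact bytes on disk."""
    data = path.read_bytes()
    rows = sum(1 for line in data.splitlines() if line.strip())
    return {"rows": rows, "sha256": hashlib.sha256(data).hexdigest()}


def check_manifest(raw_dir: Path, manifest: dict) -> list[str]:
    """Return a list of mismatches between the manifest and the raw files."""
    problems = []
    for name, expected in manifest["files"].items():  # hypothetical layout
        actual = describe_jsonl(raw_dir / name)
        for field in ("rows", "sha256"):
            if actual[field] != expected[field]:
                problems.append(f"{name}: {field} mismatch")
    return problems


with tempfile.TemporaryDirectory() as tmp:
    raw_dir = Path(tmp)
    (raw_dir / "tracks.jsonl").write_text(
        '{"track_id": "1"}\n{"track_id": "2"}\n', encoding="utf-8"
    )
    manifest = {"files": {"tracks.jsonl": describe_jsonl(raw_dir / "tracks.jsonl")}}
    clean = check_manifest(raw_dir, manifest)

    # Simulate a manifest that no longer matches the file on disk.
    manifest["files"]["tracks.jsonl"]["rows"] = 3
    stale = check_manifest(raw_dir, manifest)
```

Hashing the raw bytes rather than the parsed rows means whitespace-only edits are caught too, which matches the intent of a checksum gate.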