The Data-Platform team currently uses Miniconda to provide a number of Anaconda-based environments.
Anaconda has announced licensing changes under which non-profit and academic institutions will have to start paying for its software.
See https://www.theregister.com/2024/08/08/anaconda_puts_the_squeeze_on/
We need to migrate from Miniconda to Miniforge before this becomes a problem.
Miniconda is installed manually in these repositories:
- conda-analytics - https://gitlab.wikimedia.org/repos/data-engineering/conda-analytics/-/merge_requests/52
- airflow - https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags/-/merge_requests/829
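For the repositories above, much of the change is likely to come down to downloading a different installer. The sketch below builds the Miniforge installer URL using the naming scheme from the Miniforge README (`Miniforge3-<uname>-<uname -m>.sh`). The function name `miniforge_installer_url` is hypothetical, and a Linux or macOS host is assumed; how each repository actually wires in its installer is not shown here.

```python
import platform

# Download location documented in the Miniforge README.
MINIFORGE_BASE = "https://github.com/conda-forge/miniforge/releases/latest/download"


def miniforge_installer_url(system=None, machine=None):
    """Return the Miniforge3 installer URL for a Linux or macOS host.

    `system` and `machine` default to the values for the current host,
    which correspond to `uname` and `uname -m`.
    """
    system = system or platform.system()     # e.g. "Linux" or "Darwin"
    machine = machine or platform.machine()  # e.g. "x86_64" or "aarch64"
    return f"{MINIFORGE_BASE}/Miniforge3-{system}-{machine}.sh"


if __name__ == "__main__":
    print(miniforge_installer_url("Linux", "x86_64"))
```

Miniforge uses the conda-forge channel by default, so environments built from it do not pull from Anaconda's default channels unless they are explicitly configured to.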
We also have a number of pipelines defined in WMF Data Workflow Utils that build on:
- conda_setup_script - https://gitlab.wikimedia.org/repos/data-engineering/workflow_utils/-/merge_requests/46
These pipelines have been used in the following downstream projects:
- example-job-project - https://gitlab.wikimedia.org/repos/data-engineering/example-job-project/-/merge_requests/36
- mediawiki-content-dump - https://gitlab.wikimedia.org/repos/data-engineering/dumps/mediawiki-content-dump/-/merge_requests/45
- gdi-jobs # Archived project.
- research/knowledge gaps
- research/article-quality
- research/research-datasets
- search-platform/discolytics
- search-platform/mjolnir
- structured-data/image-suggestions
- structured-data/section-topics
- structured-data/section-image-recs # Also uses Mambaforge.
- structured-data/seal # Also uses Mambaforge.
- security/differential-privacy
- product-analytics/moderation-mariadb-jobs # Several versions
- product-analytics/automoderator-metrics-jobs # Several versions
This list was compiled from the artifacts.yaml files of each of the Airflow instances.
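To make this survey repeatable, the lookup could be scripted. The following is a minimal sketch, assuming that the artifacts.yaml files are available in a local checkout and that conda environment artifacts can be recognised by the substring "conda" in their entry. Both the function name `find_conda_artifacts` and that matching rule are assumptions, not something the files are known to guarantee.

```python
from pathlib import Path


def find_conda_artifacts(root, marker="conda"):
    """Map each artifacts.yaml under `root` to its lines mentioning `marker`.

    This is a plain-text scan, not a YAML parse: it makes no assumption
    about the file's schema, and it skips commented-out lines.
    """
    hits = {}
    for path in sorted(Path(root).rglob("artifacts.yaml")):
        matching = [
            line.strip()
            for line in path.read_text(encoding="utf-8").splitlines()
            if marker in line.lower() and not line.lstrip().startswith("#")
        ]
        if matching:
            hits[str(path)] = matching
    return hits


if __name__ == "__main__":
    # "." is a placeholder for the root of a local checkout.
    for yaml_path, lines in find_conda_artifacts(".").items():
        print(yaml_path)
        for line in lines:
            print("   ", line)
```

The output still needs a manual pass to map each matching entry back to its source project, since the scan only reports the raw lines.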