https://wikimediafoundation.org/wiki/User:Sthottingal-WMF
Maintainer and engineer for ContentTranslation, UniversalLanguageSelector, and general MediaWiki internationalization
CXServer removed its dependency on service-runner (T357950: Remove servicerunner dependency for cxserver) and was deployed to production last week. Since service-runner and the associated service template had a huge influence on how a Node.js service is written, it was not an easy migration. This was also partly because cxserver was written in 2015 and has since grown into a complex system.
Looks good to me
It looks like newlines in the ECS JSON data are treated as separate log entries, and log.level is interpreted as NOTICE when the value is "error". I can't tell whether this is related to the containerd migration. If newlines are not allowed in ECS log messages, we can fix it on the cxserver side (stack traces, if present in a log message, will contain newlines). But it would be nice if there were a way to validate this without deploying a change and seeing what happens.
Since we prioritize user experience, and sending a large chunk causes a proportional delay in response time, cxserver accepting larger chunks will not help users. Clients need to send smaller chunks of content in sequential batches, as in the sketch below.
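A minimal sketch of the sequential-batch approach; the endpoint URL and the payload/response shapes are illustrative assumptions, not the documented cxserver API:

```
# Send content in smaller sequential chunks instead of one large request.
import requests

MT_URL = "https://cxserver.wikimedia.org/v2/translate/en/es/Google"  # hypothetical

def translate_in_chunks(html_sections):
    """POST each section separately, in order, and collect the results."""
    session = requests.Session()
    translated = []
    for html in html_sections:
        resp = session.post(MT_URL, data={"html": html}, timeout=60)
        resp.raise_for_status()
        translated.append(resp.json().get("contents"))
    return translated
```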
Status: cxserver is deployed in production. A few metrics-reporting issues were noticed (some metrics missing) and are currently being fixed. Other than that, all APIs are functional.
@akosiaris We deployed this code in staging. The only issue we observe is that ECS logging is not parsed by Logstash.
Current status:
Observations from staging deployment:
The Google translation issue seems valid, but it is totally unrelated to the original one. Google translation and markup fixes are handled by the CX-cxserver project; MinT has no code related to Google.
In the design, is 'Search for a topic' about searching for a topic like "Arts", "Maths", or "Geography" (as in https://www.mediawiki.org/wiki/ORES/Articletopic)?
Or should the user type an article title like London_Fashion_Week? From the example "Cubism" I assume it is about searching for an article and then using it as a seed for suggestions.
The following JS error is noted in the console:
jquery.js:3783 jQuery.Deferred exception: Cannot read properties of undefined (reading '$el') TypeError: Cannot read properties of undefined (reading '$el') at Object.displayDrawer (https://en.m.wikipedia.org/w/load.php?lang=en&modules=skins.minerva.scripts&skin=minerva&version=emw6f:13:428) at eval (https://en.m.wikipedia.org/w/load.php?lang=en&modules=skins.minerva.scripts&skin=minerva&version=emw6f:46:886) at mightThrow (https://en.m.wikipedia.org/w/load.php?lang=en&modules=%40wikimedia%2Fcodex%…-styles%2Cjquery%2Cvue%7Cmobile.startup&skin=minerva&version=1evph:193:648) at process (https://en.m.wikipedia.org/w/load.php?lang=en&modules=%40wikimedia%2Fcodex%…-styles%2Cjquery%2Cvue%7Cmobile.startup&skin=minerva&version=1evph:194:309) undefined
Trying to debug the issue. It seems the issue occurs with the nllb-wikipedia (default) model and not with the nllb-600m model.
The UserGetLanguageObject hook handler returns a user language. It uses the browser's Accept-Language header for this purpose unless there is a language cookie or a uselang override in the URL.
@isarantopoulos Agreed, let us recheck after two weeks. From our team's perspective, we expect frequent deployments this quarter, as the recommendation API is the basis for two of our KRs.
@isarantopoulos As discussed, @KartikMistry will be deploying the recommendation API for the LPL team. If he can get deployment access, we can avoid a dependency on the ML team for frequent deployments.
@akosiaris Horizontal scaling is required; we will reach out for that separately, very soon.
@GMikesell-WMF It is a backend feature, verifiable by developers. You may skip it; we have already verified it.
Hi @jijiki, any updates on this request? Thanks.
- Machine-Translate Content Endpoint
- HTTP Verb: POST
- Production Endpoint:
- <domain>/api/rest_v1/transform/html/from/{from}
- cxserver Endpoint:
- <domain>/v1/transform/html/from/{from}
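A minimal sketch exercising the mapping above with a POST through both entry points; the domains and the {"html": ...} body shape are illustrative assumptions, not the documented request format:

```
# Call the same transform via the production REST path and directly on cxserver.
import requests

FROM = "en"
production = f"https://en.wikipedia.org/api/rest_v1/transform/html/from/{FROM}"
direct = f"https://cxserver.wikimedia.org/v1/transform/html/from/{FROM}"

payload = {"html": "<p>Hello world</p>"}  # assumed body shape
for url in (production, direct):
    resp = requests.post(url, data=payload, timeout=60)
    print(url, resp.status_code)
```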
In T357950#10091560, @MSantos wrote: This is great work! One question that I have is whether there's a plan to incorporate the changes into the service template. Would that be in scope?
A) Languages that already have a Wikipedia and MT support. We can enable the new Google support as a non-default option to give them another choice, with no need for specific coordination:
This issue has been resolved. The API is working as expected now.
A minor issue to address before closing the ticket is the broken documentation at https://api.wikimedia.org/service/lw/recommendation/docs
Currently https://api.wikimedia.org/service/lw/recommendation/v1api/v1/translation?source=en&target=fr&count=12&seed=Apple works.
An ideal API URL should be
The https://translation.googleapis.com/language/translate/v2/languages API for listing supported languages shows all the new languages. However, actual translation fails for the new languages:
{ "error": { "code": 400, "message": "Bad language pair: en|to", "errors": [ { "message": "Bad language pair: en|to", "domain": "global", "reason": "badRequest" } ], "details": [ { "@type": "type.googleapis.com/google.rpc.BadRequest", "fieldViolations": [ { "field": "target", "description": "Target language: to" } ] } ] }
Thanks @kevinbazira. I also tested, LGTM.
@jijiki That should be OK. Our team's capacity is also thin this month.
Meanwhile, we implemented diskcache-based caching, which we plan to use as a fallback cache option (for use in dev boxes, testing, etc.); a sketch follows.
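A minimal sketch of the diskcache-based fallback; the cache path and the memoized function are illustrative assumptions, not the actual code:

```
# Cache expensive results on local disk so dev boxes and tests need no
# shared cache backend.
import diskcache

cache = diskcache.Cache("/tmp/recommendation-cache")

@cache.memoize(expire=3600)  # keep results on local disk for an hour
def fetch_recommendations(source, target, seed):
    # ...expensive upstream call goes here...
    return []
```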
@kevinbazira I added a CXSERVER_HEADER config value to match the env values in https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/1058574/2/helmfile.d/ml-services/recommendation-api-ng/values-ml-staging-codfw.yaml#23
@Isaac, both are good ideas. We had discussed the second one in our team. The first one adds flexibility with minimal technical cost on our side.
@Isaac As a real example to work on for the first iteration, I created the simplest version of a campaign at https://meta.wikimedia.org/wiki/User:Santhosh.thottingal/Essential_Biography. You can see our campaign marker in the page, along with the list.
@kevinbazira This looks great. There is a minor blocker for production deployment, though. Our CX client-side code sends query params as s, t, n, etc., and the new API does not accept them. I submitted a patch for this, and if it goes out with this week's train, we should be able to deploy the new API by early next week.
As per the discussions regarding early technical iterations towards this goal, we decided the following:
Do we really need an external library here? What are the limitations we see with vanilla JS?
Node 20 supports a native test runner, so we can use that opportunity to untangle our tests from service-runner. cxserver tests need not depend on service-runner; maybe they can use Express for a test server.
Initial exploration: https://gerrit.wikimedia.org/r/c/mediawiki/services/cxserver/+/1055769/
CampaignEvents is installed, though. If we want API-based output like api.php?action=query&prop=translationcampaign&titles=WikiForHumanRights returning the List page, we can enhance that extension. For an MVP, some marker in the page is enough so that these pages can be retrieved using the search API, probably using incontent or hastemplate as outlined in https://www.mediawiki.org/wiki/Help:CirrusSearch. A sketch follows.
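A minimal sketch of retrieving such pages through the search API; the marker template name "Translation campaign" is hypothetical, not an existing template:

```
# Find campaign pages via CirrusSearch's hastemplate: filter.
import requests

API = "https://meta.wikimedia.org/w/api.php"
params = {
    "action": "query",
    "list": "search",
    "srsearch": 'hastemplate:"Translation campaign"',
    "format": "json",
}
for hit in requests.get(API, params=params).json()["query"]["search"]:
    print(hit["title"])
```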
I had thought about this topic+article mixing, and I have an idea for its implementation, but I deferred it to another patch, once these are merged and tested.
A suggestion for supporting both server-side rendering and future client-side interactivity:
The source code at https://github.com/wikimedia/research-recommendation-api has a lot of legacy code and broken or unmaintained dependencies. The web frontend uses Bower, jQuery, and similarly old tooling. Recent updates by the Machine Learning team got it somewhat functional, to the extent that it is integrated with Lift Wing. But adding new features requires more fixups to get a smooth local development experience. We can ignore the web frontend part (AKA gapfinder) for now, as we are interested only in the API.
My preference is to enhance the "new" recommendation API at https://api.wikimedia.org/wiki/Lift_Wing_API/Reference/Get_content_translation_recommendation so that it can accept a topic (for example: Chemistry, History, Africa, Music) and give recommendations. It should accept more than one topic. We can also look at an intersection of topic and article at a later stage. A sketch of the proposed call follows.
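A minimal sketch of how the enhanced API could be called; the topics parameter and its pipe-separated multi-value form are proposals here, not the current API:

```
# Proposed topic-based recommendation request (parameters are hypothetical).
import requests

API = "https://api.wikimedia.org/service/lw/recommendation/v1api/v1/translation"
params = {"source": "en", "target": "ml", "count": 12, "topics": "Chemistry|Music"}
print(requests.get(API, params=params).json())
```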
This ticket proposes to adjust the tab that the user navigates to by default, by considering the user's previous selections and the existence of previous contributions:
Internally, in CX production and in our developer workflows, we directly use the cxserver APIs, not the RESTBase APIs like https://en.wikipedia.org/api/rest_v1/#/Transforms/doMT.
Dda + nukta forming the same ligature rendering as rra is a common issue in Gurmukhi fonts. For example, Ektype's Mukta has this issue. This practice of giving the nukta form and RRA the same shape is not advised, yet many fonts do it. That is why you see two different shapes as reported above. Common users, unaware of this encoding difference and focusing only on rendering, use them interchangeably. This wrong usage appears in corpora. For example, in many Dravidian scripts I have seen people using 0 (zero) in place of ഠ, : (colon) instead of ഃ (visarga), and so on. A neural MT system learns these, and the same issues appear in MT output. I have seen this issue in many other languages too.
@elukey Thanks for these details. Currently in our code, models are downloaded by a bootstrap shell script (called via the docker entrypoint mechanism) using simple wget. These models are then mounted on the docker volume. So our server code just assumes the models are present at a configurable file-system location. Do you see any issue if we follow this approach? Do the caveats you mentioned complicate it?
A round-trip technique like wikitext → HTML → wikitext is one way to achieve this. However, it has limitations. For example, if the wikitext has a template and one of the template parameters is nested wikitext, we will miss it in the HTML rendering (for example, i18n sentences with plural syntax). So the translation will be incomplete.
I was able to reproduce this and find the pattern that causes the issue: repeated references. Only the first one gets fixed in MT; from the second one onwards, it appears as plain text. A few months back I addressed this by keeping a search start position in the lookup logic, but it is not catching repetitions outside the sentence. I am exploring potential solutions.
The CX entrypoint is also duplicated if you click multiple times while the language selector is loading: