Google AI Releases C2S-Scale 27B Model that Translates Complex Single-Cell Gene Expression Data into ‘cell sentences’ that LLMs can Understand

October 17, 2025

A team of researchers from Google Research, Google DeepMind, and Yale released C2S-Scale 27B, a 27-billion-parameter foundation model for single-cell analysis built on Gemma-2. The model formalizes single-cell RNA-seq (scRNA-seq) profiles as “cell sentences” (ordered lists of gene symbols) so that a language model can natively parse and reason over cellular states. Beyond benchmarking gains, the research team reports an experimentally validated, context-dependent pathway: CK2 inhibition (silmitasertib/CX-4945) combined with low-dose interferon amplifies antigen presentation, a mechanism that could make “cold” tumors more responsive to immunotherapy. The result is a ~50% increase in antigen presentation in vitro under the combined condition.

Understanding the model

C2S-Scale converts a high-dimensional expression vector into text by rank-ordering genes and emitting the top-K symbols as a gene-name sequence. This representation aligns single-cell data with standard LLM toolchains and allows tasks such as cell-type prediction, tissue classification, cluster captioning, perturbation prediction, and biological QA to be phrased as text prompts and completions.
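The rank-and-emit step described above can be illustrated with a minimal sketch. This is not the Cell2Sentence library's API: the function name `cell_to_sentence`, the `top_k` parameter, the choice to drop zero-expression genes, the toy gene list, and the prompt wording are all assumptions made for illustration.

```python
import numpy as np

def cell_to_sentence(expression, gene_symbols, top_k=100):
    """Rank genes by expression (highest first) and join the top-k symbols.

    Hypothetical helper for illustration; not the cell2sentence package API.
    """
    expression = np.asarray(expression, dtype=float)
    if expression.shape[0] != len(gene_symbols):
        raise ValueError("expression and gene_symbols must have the same length")
    # Stable sort on negated values: descending order, ties keep input order.
    order = np.argsort(-expression, kind="stable")
    # Assumption: genes with zero expression are left out of the sentence.
    expressed = [i for i in order if expression[i] > 0]
    return " ".join(gene_symbols[i] for i in expressed[:top_k])

# Toy values, invented for the example (not real measurements).
genes = ["CD3E", "MS4A1", "HLA-A", "B2M", "LYZ"]
counts = [0.0, 2.0, 7.0, 12.0, 1.0]

sentence = cell_to_sentence(counts, genes, top_k=3)
print(sentence)  # B2M HLA-A MS4A1

# A downstream task is then phrased as plain text (the wording is an assumption).
prompt = f"Predict the cell type of this cell: {sentence}"
print(prompt)
```

Because the output is an ordinary string, it can be passed to any tokenizer and prompt template without a custom input layer, which is the point of the representation.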

https://github.com/vandijklab/cell2sentence

Training data, stack, and release

C2S-Scale-Gemma-2-27B is built on Gemma-2 27B (decoder-only Transformer), trained on Google TPU v5, and released under CC-BY-4.0. The training corpus aggregates >800 public scRNA-seq datasets spanning >57M cells (human and mouse) with associated metadata and textual context; pretraining unifies transcriptomic tokens and biological text into a single multimodal corpus.

The key result: an interferon-conditional amplifier

The research team built a dual-context virtual screen over >4,000 drugs to find compounds that boost antigen presentation (MHC-I program) only in immune-context-positive settings (i.e., primary patient samples with low interferon tone) while having negligible effect in immune-context-neutral cell-line data. The model predicted a striking context split for silmitasertib (CK2 inhibitor): strong MHC-I upregulation with low-dose interferon, little to none without interferon. The research team reports in-lab validation in human neuroendocrine models unseen in training, with the combination (silmitasertib + low-dose interferon) producing a marked, synergistic increase in antigen presentation (≈50% in their assays).
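The selection logic of such a dual-context screen can be sketched as a simple filter-and-rank step. The sketch below is an assumption about how a context split might be scored, not the team's actual pipeline: the `ScreenResult` fields, the `neutral_tolerance` threshold, and every number in `toy` are invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class ScreenResult:
    drug: str
    delta_positive: float  # predicted MHC-I change in the immune-context-positive setting
    delta_neutral: float   # predicted MHC-I change in the immune-context-neutral setting

def rank_by_context_split(results, neutral_tolerance=0.1):
    """Keep drugs that act only in the positive context, ranked by the split.

    Hypothetical scoring rule: effect in the neutral context must be within
    +/- neutral_tolerance, and the effect in the positive context must be > 0.
    """
    hits = [
        r for r in results
        if abs(r.delta_neutral) <= neutral_tolerance and r.delta_positive > 0
    ]
    return sorted(hits, key=lambda r: r.delta_positive - r.delta_neutral, reverse=True)

# Made-up predictions for four placeholder compounds (not data from the study).
toy = [
    ScreenResult("drug_A", 0.80, 0.05),   # strong, context-specific
    ScreenResult("drug_B", 0.60, 0.55),   # acts in both contexts, so filtered out
    ScreenResult("drug_C", 0.30, -0.02),  # weaker, context-specific
    ScreenResult("drug_D", -0.10, 0.00),  # no boost in the positive context
]

ranked = rank_by_context_split(toy)
print([r.drug for r in ranked])  # ['drug_A', 'drug_C']
```

Filtering on the neutral-context effect before ranking is what separates a conditional amplifier from a compound that raises MHC-I everywhere.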

The amplifier lowers the response threshold to interferon rather than initiating antigen presentation de novo; flow-cytometry readouts show HLA-A,B,C upregulation only under combined treatment (including IFN-β and IFN-γ), across two neuroendocrine models, with representative MFI gains (e.g., 13.6% @10 nM and 34.9% @1000 nM silmitasertib in one model).
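For readers unfamiliar with how a percentage MFI gain is computed, here is a minimal sketch of the arithmetic. The article does not state which condition serves as the baseline, so treating interferon alone as the reference is an assumption, and the MFI values below are made up solely so that the arithmetic yields 13.6%; they are not measured data.

```python
def percent_gain(treated_mfi, baseline_mfi):
    """Relative increase of treated over baseline, as a percentage."""
    if baseline_mfi <= 0:
        raise ValueError("baseline_mfi must be positive")
    return (treated_mfi - baseline_mfi) / baseline_mfi * 100.0

# Hypothetical median fluorescence intensities (arbitrary units).
ifn_only = 1000.0        # assumed baseline: low-dose interferon alone
ifn_plus_drug = 1136.0   # assumed combined-treatment reading

print(round(percent_gain(ifn_plus_drug, ifn_only), 1))  # 13.6
```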

Key Takeaways

C2S-Scale 27B (Gemma-2) encodes scRNA-seq profiles as textual “cell sentences,” enabling LLM-native single-cell analysis workflows.
In a two-context virtual screen (>4,000 compounds), the model predicted an interferon-conditional amplifier: CK2 inhibition (silmitasertib) boosts MHC-I antigen presentation only with low-dose IFN.
Wet-lab tests in human neuroendocrine cell models confirmed the prediction, with a ~50% antigen-presentation increase for silmitasertib+IFN versus either alone; this remains preclinical/in vitro.
Open weights and usage docs are live on Hugging Face (vandijklab) with both 27B and 2B Gemma variants for research use.

C2S-Scale 27B is a technically credible step for LLMs in biology: translating scRNA-seq into “cell sentences” lets a Gemma-2 model run programmatic queries over cell states and perturbations, and in practice it surfaced an interferon-conditional amplifier, silmitasertib (CK2 inhibition), that increases MHC-I antigen presentation only with low-dose IFN, a mechanism the team then validated in vitro. The value here isn’t headline rhetoric but the workflow: text-native screening across >4k compounds under dual immune contexts to propose a context-dependent pathway that may convert immune-“cold” tumors toward visibility. That said, all evidence is preclinical and bench-scale; the right read is “hypothesis-generating AI” with open weights enabling replication and stress-testing, not a clinical claim.

Check out the Technical Paper, Model on HF, GitHub Page, and Technical details.

Michal Sutter is a data science professional with a Master of Science in Data Science from the University of Padova. With a solid foundation in statistical analysis, machine learning, and data engineering, Michal excels at transforming complex datasets into actionable insights.
