ZINC subsets


ZINC is big. Currently, ZINC has over 12M molecules, about 9M of which are commercially available. For most applications, most users of ZINC will only want or need to download a fraction of ZINC: a subset. This article describes subsets.

Property Subsets

Subsets of ZINC by one-dimensional physical property (molecular weight, calculated logP) are the single most popular way to acquire ZINC. Of these, the first two subsets, "lead-like" (subset #1) and "fragment-like" (subset #2), are by far the most popular. There are good reasons for this.


lead like

Lead-like compounds are large enough to be detected in high-throughput spectrophotometric or other cheap assays, yet smaller than most drugs, which have been highly optimized for a specific application. Lead-like compounds will be more soluble, in general, than their bigger "drug-like" cousins, and thus more likely to actually be assayed.

fragment like

Fragment-like compounds are even smaller than leads. The good news is that they sample chemical space more thoroughly than is possible with leads. The bad news is that they are often too small to be detected in a cheap assay, requiring direct biophysical measurement, such as SPR, NMR, or X-ray crystallography.

Together, leads and fragments represent the dominant thinking in the field for screening. The remaining subsets can also be interesting. Here we give a brief explanation of why you might want each one.

drug like

Drug-like (#3) captures the famous rule of five, which is itself just a guideline to which there are many exceptions. There will be times when you want to screen the "drug-like" subset of ZINC, but this would probably be later in the project, after you have had a good look at the leads, or when there is some unusual circumstance.
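For readers who have not met the rule of five, here is a minimal sketch of the guideline as it is usually stated (Lipinski): molecular weight at most 500, logP at most 5, at most 5 hydrogen-bond donors, and at most 10 hydrogen-bond acceptors, with one violation commonly tolerated. The `MolProps` class, the function names, and the `max_violations` default are hypothetical names chosen for illustration. The exact cutoffs ZINC applies to subset #3 are not given in this article and may differ.

```python
from dataclasses import dataclass


@dataclass
class MolProps:
    """Precomputed properties for one molecule (hypothetical container)."""
    mol_weight: float   # molecular weight in daltons
    logp: float         # calculated logP
    h_donors: int       # hydrogen-bond donors
    h_acceptors: int    # hydrogen-bond acceptors


def rule_of_five_violations(p: MolProps) -> int:
    """Count how many of the four rule-of-five limits are exceeded."""
    checks = [
        p.mol_weight > 500,
        p.logp > 5,
        p.h_donors > 5,
        p.h_acceptors > 10,
    ]
    return sum(checks)


def is_drug_like(p: MolProps, max_violations: int = 1) -> bool:
    """Assumption: tolerate up to one violation, as is commonly done."""
    return rule_of_five_violations(p) <= max_violations


# Illustrative property values, not taken from ZINC.
small = MolProps(mol_weight=180.2, logp=1.2, h_donors=1, h_acceptors=4)
greasy = MolProps(mol_weight=650.0, logp=6.5, h_donors=2, h_acceptors=8)

print(is_drug_like(small))   # True: no limits exceeded
print(is_drug_like(greasy))  # False: weight and logP both exceed the limits
```

Counting violations rather than returning a plain yes or no keeps the "just a guideline" spirit: you can loosen or tighten `max_violations` to suit the project.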


Greasy-leads (#4) and Big-n-greasy (#5) are deprecated. Frankly, these compounds are nothing but trouble, since they often do not dissolve. If you really want them back, write to me; otherwise, they are gone.

everything subsets

All purchasable (#6) comes in third place for popularity. Advantage: you can buy these compounds. Disadvantage: for target-based virtual screening, many of these compounds will be a waste of time, because they are too big, too specific, and too greasy (insoluble).

Subsets 7, 8, and 9 will return soon.

Everything (#10) comes in fourth place for popularity, since it is, well, everything we can let you have. We frankly don't think you really want this, but people keep asking for it, so, here it is.

Subsets 11-16 will return soon.

fragment variations

Neutral-fragments (#17) are what the name suggests: uncharged fragments. Why would you want this? Charged compounds often have a hard time getting into cells. Docking programs can have trouble weighting charged compounds against neutral ones. Wham - put those ideas together and you see why neutral fragments can be interesting.

Subsets 18-28 will return soon.

CNS permeable (#29) compounds are of interest for projects where getting through the blood-brain barrier (BBB) is important. We have used well-known criteria for this subset.

Monoanions (#31) and monocations (#32) - we do not know why you would want these. We created them for a particular project.
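The charge-based subsets (neutral, monoanion, monocation) amount to a split on net formal charge. A minimal sketch, assuming each molecule already carries an integer net charge: the identifiers `mol-a` to `mol-d` and the function names are hypothetical, and the real subset definitions may add further criteria that this article does not state.

```python
from collections import defaultdict


def charge_class(net_charge: int) -> str:
    """Map a net formal charge to a charge-based subset label."""
    if net_charge == 0:
        return "neutral"
    if net_charge == -1:
        return "monoanion"
    if net_charge == 1:
        return "monocation"
    return "other"  # multiply charged species fall outside these subsets


def split_by_charge(molecules):
    """Group (identifier, net_charge) pairs by charge class."""
    groups = defaultdict(list)
    for mol_id, net_charge in molecules:
        groups[charge_class(net_charge)].append(mol_id)
    return dict(groups)


# Hypothetical identifiers and charges, for illustration only.
library = [("mol-a", 0), ("mol-b", -1), ("mol-c", 1), ("mol-d", -2)]
groups = split_by_charge(library)
print(groups["neutral"])    # ['mol-a']
print(groups["monoanion"])  # ['mol-b']
```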

Goldilocks (#33) is yet another set that tries to "shoot for the middle" of the chemical space problem and balance the competing advantages and disadvantages of bigger vs smaller molecules.

personal subsets

Piotr (#38), kerim-like (#42), and abram (#49) were all created for specific projects. We do not know why you might want these, but they are available should you need them.

research subsets

Stiff-soluble (#50) and stiffs (#51) are for testing ideas about the entropy loss of the ligand on binding. So they are for research, but you might want them too.

Vendor Subsets

We offer subsets by vendor.

User-created subsets or mini subsets

We offer the capability to create small subsets.

User-uploaded subsets

We offer the capability to upload compounds for processing.

By Annotation

We offer compounds by annotation. More soon.

Synthesis on Request

Some vendors offer compounds that they will make if asked, usually within about 10 weeks. We like these compounds, because they greatly expand the region of chemical space one can sample without performing synthesis oneself.

-- John Irwin