Downsides of using Common Crawl
I took a look at the Common Crawl data I pre-processed myself last year and could not find abstracts, only individual sentences.
I also took a look at the archives at data.statmt.org/ngrams/deduped/ and they contain only sentences as well, though the sentences sometimes seem to be in logical order.
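So you can use any form of CC, but only to learn word representations, not sentence representations: without reliable sentence order or document boundaries there is no context beyond the single sentence to learn from.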
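
To illustrate why sentence-only data is still fine for word-level statistics, here is a minimal sketch. The toy corpus and the cooccurrence_counts helper are my own assumptions for illustration, not part of any CC tooling. It shows that within-sentence co-occurrence counts, the kind of signal word representations are typically built from, do not change when the sentences are shuffled.

```python
import random
from collections import Counter


def cooccurrence_counts(sentences, window=2):
    """Count unordered word pairs that occur within `window` tokens
    of each other inside a single sentence (hypothetical helper)."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, left in enumerate(tokens):
            # Only look ahead inside the same sentence.
            for right in tokens[i + 1 : i + 1 + window]:
                counts[tuple(sorted((left, right)))] += 1
    return counts


# Toy stand-in for deduplicated CC sentences (made up for this example).
sentences = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "a cat chased a dog",
]

# Simulate the loss of document order by shuffling the sentences.
shuffled = sentences[:]
random.Random(0).shuffle(shuffled)

original_counts = cooccurrence_counts(sentences)
shuffled_counts = cooccurrence_counts(shuffled)

# Word-level statistics are identical regardless of sentence order.
print(original_counts == shuffled_counts)
```

Anything that depends on the neighbouring sentence (next-sentence prediction, paragraph or abstract-level context) has no such guarantee, which is the limitation described above.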