0

I want to use my 500GB NVMe SSD as pseudo-RAM for deep learning by increasing the pagefile size. I have 28GB of RAM and want 64GB+. From the limited info I could find on DDR4, this SSD appears to have about 10% of its read speed - excellent, and sufficient for my application.

Reading up on others doing the same, advice has been shifting from "SSD funeral" to "no longer bad" - yet I remain uncertain, especially for my application. One train epoch reads 64GB of data, but writes ~zero. The linked SSD boasts 1.2 petabytes of writes, but gives no specs on reads; I presume the read limit is much greater, if it's relevant at all? Assuming the two are equal, the SSD can handle 18,000+ epochs - which makes degradation a non-issue.
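For reference, the 18,000+ figure is just the rated endurance divided by the data read per epoch. A quick sketch of that arithmetic, under the pessimistic assumption above that reads wear the drive as much as writes do (the constant names are only for illustration):

```python
# Rough endurance estimate. Assumes reads count against the rated write
# endurance one-for-one, which is the pessimistic case from the question.
RATED_ENDURANCE_BYTES = 1_200 * 10**12   # 1.2 PB, the figure quoted for the drive
BYTES_PER_EPOCH = 64 * 10**9             # 64 GB read per training epoch

epochs = RATED_ENDURANCE_BYTES // BYTES_PER_EPOCH
print(f"{epochs:,} epochs")              # 18,750 epochs
```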

This said, what's the status on this matter now, in 2019?

(As a side question, is its 2,280 Gb/s data transfer rate spec a likely typo?)

  • 1
    Pagefile does not give you extra RAM. It's used to page out programs that don't need RAM _at the moment_. If you need more RAM at a given instant than you physically have, pagefile won't help you. – gronostaj Sep 05 '19 at 11:09
  • 1
    @gronostaj I didn't claim it gives extra RAM - I said "pseudo-RAM" - and frankly, for my application, it _is_ extra RAM; the SSD's read speed is 10% of my RAM's - more than enough for unnoticeable data load time – OverLordGoldDragon Sep 05 '19 at 14:30
  • _"One train epoch reads 64GB of data"_ so if I understand correctly, you need 64 GB used all at once. Pagefile won't help you with this. – gronostaj Sep 06 '19 at 06:33
  • @gronostaj Only 480MB at a time, not all at once - I've done the benchmarks with Numpy memory mapping: a memmapped array loads in 0.3 secs, vs 5 secs unmapped, and that's the figure on my HDD snail - i.e. the SSD should pull < 0.01 secs. (But yes, I do intend to map all 64GB onto pagefile - it shouldn't be a problem, correct?) – OverLordGoldDragon Sep 06 '19 at 13:41
  • I'm not sure, but I'd first give it a try on a HDD before you spend any money on this. At this rate filling your 28 GB should take about a minute. – gronostaj Sep 06 '19 at 14:15
  • @K7AAY Any input on this? Also, correction - I have a hybrid drive, and pagefile's defaulted to C-drive with the "SSD portion" - as for testing, it works for up to 5GB of memmapping, though didn't try increasing pagefile size as disk space's limited - but can free if needed – OverLordGoldDragon Sep 06 '19 at 14:21
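
The access pattern discussed in the comments above (map the whole dataset, touch only one batch at a time) can be sketched as follows. This is a minimal illustration, not the asker's code: it uses Python's standard-library `mmap` in place of NumPy's memmap, and the file name, `iter_batches` helper, and tiny sizes are all hypothetical stand-ins for the 480MB batches and 64GB dataset.

```python
import mmap
import os
import tempfile

BATCH_BYTES = 4096   # stand-in for the ~480MB batch from the comments
N_BATCHES = 8        # stand-in for the full 64GB dataset

# Build a small dummy data file (hypothetical; a real dataset would already exist).
path = os.path.join(tempfile.mkdtemp(), "train.bin")
with open(path, "wb") as f:
    for i in range(N_BATCHES):
        f.write(bytes([i]) * BATCH_BYTES)

def iter_batches(path, batch_bytes):
    """Yield successive batches from a read-only memory map of the file."""
    with open(path, "rb") as f, mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        for start in range(0, len(mm), batch_bytes):
            # Slicing copies only this batch into process memory.
            yield mm[start:start + batch_bytes]

batches = list(iter_batches(path, BATCH_BYTES))
print(len(batches), "batches of", len(batches[0]), "bytes")
```

One thing worth keeping in mind with this approach: a read-only mapping of a data file is paged in from that file itself, so the relevant speed is that of whichever drive holds the data file.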

2 Answers

1

2) 3,500 MB/s is more realistic than 2,280 Gb/s; that's the number they hang out front.

1) Writes stress the drive much more than reads, so if your app is very read-heavy, you just might get that 600 TB of writes out of the drive. I would watch the writes like a hawk, and if you ever get within 2 orders of magnitude of 600 TB written (that is to say, 6 TBW), expect it to go boom, and get that warranty claim form handy.

This article describes degradation caused by reads, which creates a need to rewrite the data every 100K reads, a task handled by the firmware in the SSD. As you stated in your second paragraph, that wear is far smaller than what writes cause.
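
To put a rough number on that, here is a back-of-the-envelope sketch. It assumes, very crudely, that every 100K bytes read cost one byte rewritten; real firmware tracks read disturb per block, so treat this as an order-of-magnitude illustration only. The 64 GB per epoch and the 18,750-epoch count come from the question; the constant names are made up for the example.

```python
READS_PER_REWRITE = 100_000           # "rewrite every 100K reads", from the article
BYTES_READ_PER_EPOCH = 64 * 10**9     # 64 GB per epoch, from the question
EPOCHS = 18_750                       # the question's estimate (1.2 PB / 64 GB)
RATED_WRITES = 600 * 10**12           # 600 TB written, the figure used in this answer

# Assumption: rewritten bytes scale linearly with bytes read.
rewrite_per_epoch = BYTES_READ_PER_EPOCH // READS_PER_REWRITE
total_rewrite = rewrite_per_epoch * EPOCHS
share_of_rating = total_rewrite / RATED_WRITES

print(f"{rewrite_per_epoch:,} bytes rewritten per epoch")
print(f"{total_rewrite:,} bytes rewritten over {EPOCHS:,} epochs")
print(f"{share_of_rating:.4%} of the rated writes")
```

Under those assumptions the read-induced rewrites come to about 12 GB over the whole run, a tiny fraction of the rated endurance.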

You may also enjoy this thought-provoking review with lots of chewy numbers.

3) Of course, there's that idle power consumption ceiling of 30 Megawatts to contend with; have you considered moving to Grand Coulee Dam, or somewhere in the Tennessee Valley Authority?

K7AAY
  • 1
    I didn't even notice that power spec - guess I'll check Amazon for thorium. As for "you just might get that out the drive" - i.e. good to go? To clarify, my usage read-to-write ratio is ~5000:1 – OverLordGoldDragon Sep 04 '19 at 22:55
  • 1
    @OverLordGoldDragon Brazil is estimated to have 16 kilotons of Thorium reserves https://en.wikipedia.org/wiki/Occurrence_of_thorium#Thorium_reserve_estimates, so the Amazon could be a good place to get it. My point about #3, however, is if Big River Warehouse Company has one typo for that product, they could well have another, so I would check Samsung's spec sheet https://www.samsung.com/semiconductor/minisite/ssd/product/consumer/970pro/ to see what they have to say. – K7AAY Sep 04 '19 at 23:13
  • 1
    Thanks for the link & reads - the review is fairly thorough. My takeaway @"degradation caused by reads" is that the re-writes are a non-issue - if 100,000 reads are worth one write, I have virtually infinite reads. The "maintenance"/background ops should also be a non-issue. Overall, a green light - fair? – OverLordGoldDragon Sep 04 '19 at 23:39
  • 1
    IMHO, yes. If this is a good answer, please click the grey up arrow to the left of the answer; if this absolutely solves your problem and is the best answer, please also click the check mark. – K7AAY Sep 04 '19 at 23:43
  • 1
    Arrow's done - as for accepted answer, I'll wait a little to keep the question alive for any dissenting answers. Either way, thank you for the useful info & humor – OverLordGoldDragon Sep 04 '19 at 23:46
  • 1
    The skeptics were wrong - _it worked_. Now figuring out some optimizations - will make a post eventually on what should be a ubiquitous data pipeline practice. – OverLordGoldDragon Sep 11 '19 at 14:57
  • 1
    43-FOLD speedup in data load time, and **37-FOLD SPEEDUP in train time**. This is insane. The entire DL community committed a crime by not teaching this simple method for effectively using SSD as RAM. Furthermore, I think these figures can be increased to ~130 and ~70, respectively, though I won't get into it anytime soon - unless someone answers [this](https://superuser.com/questions/1481251/how-to-store-data-in-pagefile). – OverLordGoldDragon Sep 11 '19 at 22:53
  • 1
    Be interesting to look at those numbers and see if they decline over time, say, 10,000 runs from now. – K7AAY Sep 11 '19 at 23:01
0

The number of reads is a non-issue; it's generally writes that cause degradation - and SSDs are also more robust than originally believed (although when they do fail, it's often sudden and catastrophic).

2280 megabytes per second is entirely believable for an NVMe SSD. SATA-based ones are a lot slower; around 500 megabytes/sec would apply to those.
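
As a rough illustration of what those rates mean for the 64 GB per epoch in the question, here is a small sketch. It assumes purely sequential reads at the quoted throughput, which is the best case; the 3,500 MB/s entry is the figure from the other answer, and the labels are just for display.

```python
EPOCH_MB = 64_000  # 64 GB per epoch, from the question (decimal units)

throughput_mb_s = {
    "SATA SSD (~500 MB/s)": 500,
    "NVMe at 2,280 MB/s": 2_280,
    "NVMe at 3,500 MB/s": 3_500,
}

# Best-case time to stream one epoch of data at each rate.
seconds = {name: EPOCH_MB / rate for name, rate in throughput_mb_s.items()}
for name, s in seconds.items():
    print(f"{name}: {s:.1f} s per epoch")
```

Paging tends to be random rather than sequential access, so real-world times would likely be longer than these.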

davidgo