
I am working on Kaggle's Galaxy Zoo competition with Keras/TensorFlow, but the huge amount of data (a lot of images) sends my computer into limbo. Mine is a more or less ordinary PC (i5) with a generous 48 GB of RAM, although I am unable to use my GPU (my video card is not CUDA-compatible). I use an Ubuntu & Anaconda combo.

The actual problem is that Python throws a MemoryError while reading the images from disk into a stacked numpy array. Seemingly my memory is insufficient for the job, and I imagine the same would be true for any serious task (there are, of course, projects beyond MNIST classification).
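
For illustration, the failing step looks roughly like the sketch below. The directory name, the use of PIL, and the dtype are placeholders rather than my exact script, but the pattern is the same: decode every JPEG and stack everything into one big in-memory array.

    # Rough sketch of the eager-loading pattern that ends in MemoryError.
    # The path and dtype here are illustrative placeholders.
    import glob

    import numpy as np
    from PIL import Image

    image_paths = sorted(glob.glob("images_training_rev1/*.jpg"))

    # Decoding every image and stacking the results asks numpy for one huge
    # contiguous block; the allocation fails once it exceeds the free RAM.
    X = np.stack([np.asarray(Image.open(p), dtype=np.float32)
                  for p in image_paths])
    print(X.shape)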

So my question is: what kind of infrastructure is capable of handling jobs of this scale, and how could I get it? And what is the real bottleneck here, actually? Memory? Curiously, the Linux top command shows only about 10% memory usage for the running Python process.
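
For context, here is a back-of-the-envelope estimate of what a single stacked array would need. The image count and dimensions (roughly 61,000 training JPEGs of 424×424×3) are my assumptions about the dataset, so the exact figures may differ:

    # Back-of-the-envelope RAM estimate for holding the whole training set
    # in one numpy array; the image count and size are assumptions.
    n_images = 61578
    height, width, channels = 424, 424, 3

    for dtype_name, bytes_per_value in [("uint8", 1), ("float32", 4), ("float64", 8)]:
        total_bytes = n_images * height * width * channels * bytes_per_value
        print("{}: {:.0f} GiB".format(dtype_name, total_bytes / 1024.0 ** 3))

    # Prints roughly: uint8: 31 GiB, float32: 124 GiB, float64: 247 GiB.
    # A float32 (let alone float64) stack is far beyond 48 GB, which would
    # explain the MemoryError even though top reports low usage right up to
    # the point where the big allocation fails.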

Of course, I'm not on the level of institutional players, so only reasonable costs are acceptable...

karel
Hendrik
  • Could you post the full Python traceback? – don.joey Jan 19 '17 at 10:29
  • Voting to close because of interaction in comments on karel's answer – don.joey Jan 19 '17 at 12:42
  • I'm voting to close this question as off-topic because it's 1) too broad, 2) primarily based on opinion, and 3) doesn't have a strong relation to Ubuntu. If you can narrow its scope and concretise it, it may fit on [AI.SE], [SO], or more generally [SU]. – David Foerster Jan 21 '17 at 08:57

0 Answers