Sorting table 7 chia

Caught plotting error: bad allocation #3442

Comments

hardhub commented May 2, 2021

Starting phase 1/4: Forward Propagation into tmp files. Sun May 2 19:03:52 2021
Computing table 1
F1 complete, time: 293.682 seconds. CPU (221.09%) Sun May 2 19:08:45 2021
Computing table 2
Caught plotting error: bad allocation
Traceback (most recent call last):
File "chia\cmds\chia.py", line 81, in <module>
File "chia\cmds\chia.py", line 77, in main
File "click\core.py", line 829, in __call__
File "click\core.py", line 782, in main
File "click\core.py", line 1259, in invoke
File "click\core.py", line 1259, in invoke
File "click\core.py", line 1066, in invoke
File "click\core.py", line 610, in invoke
File "click\decorators.py", line 21, in new_func
File "chia\cmds\plots.py", line 135, in create_cmd
File "chia\plotting\create_plots.py", line 176, in create_plots
RuntimeError: bad allocation
[5076] Failed to execute script chia

This happens only on Windows, and only when many parallel processes are running.
Disk space is sufficient and RAM is sufficient. The RAM is not faulty (but I will test more).
There is no swap at all because it is not needed (there is a lot of RAM).
The first n processes start and work fine (no errors).
But the next one fails while computing tables.

Are you sure it is a hardware issue and not a software bug?

OS: Windows 7 x64


[BUG] Plot failing randomly #2951

Comments

MasterAssailant commented Apr 28, 2021

Describe the bug
I am using the GUI to plot. After restarting the GUI, it creates a few plots successfully. Then it fails at a random point (I have seen it fail anywhere from 5X% up to 9X%) without showing any error in the log. It just gets stuck: the log stops updating and CPU and disk activity drop back to normal. I have to restart and clear everything to plot again, until it fails once more. It happens whether I plot sequentially or in parallel. The plotting SSD doesn’t seem to be overheating. The plotting setting I am using is 6750MB of memory with 4 threads for each queue. I have tried the default values but the problem persists.

Here’s a sample of the last entries from the log from the most recent crash:

In recent failures I noticed an application error entry in the Windows Event Viewer, but for the latest occurrence there is no such entry.

This bug is present from version 1.0.X till the latest version 1.1.2.

To Reproduce
Steps to reproduce the behavior:

  1. Perform plotting via the GUI.
  2. After maybe 1 to 2 plots finish, plotting fails.

Expected behavior
Plotting stops, no more log update with no error shown in the log. CPU and Disk activities back to normal.

Screenshots
(Just normal plotting GUI screen with the status stuck at «Plotting» and showing the percentage.)

Desktop (please complete the following information):

  • OS: Windows 10
  • OS Version/Flavor: 20H2 19042.928
  • CPU: i9 10900K
  • RAM: 64GB
  • Plotting Temp Disk: Samsung 980 EVO 1TB
  • Destination Disk: WD Red 4TB

Additional context
I just saw another issue about a memory leak. Could that be a cause of this issue?
I have also tried closing other programs on my computer while plotting, but the problem still occurs.
I have other, lower-spec computers plotting as well, but the problem only occurs on this one. Could this be a problem with the plotting disk, the 980 EVO?
I also noticed that the number of processes shown in Task Manager is quite large; I am not sure if this is normal.

Thank you so much in advance!


Why sometimes stuck at 31% sometimes not #2309

Comments

mzhu46 commented Apr 21, 2021

I noticed that a scanning process sometimes happens at around 31% of the plotting process.
Is this avoidable?

Forward propagation table time: 1586.864 seconds. CPU (145.140%) Wed Apr 21 18:39:11 2021
Time for phase 1 = 10210.805 seconds. CPU (144.090%) Wed Apr 21 18:39:11 2021

Starting phase 2/4: Backpropagation into tmp files. Wed Apr 21 18:39:11 2021
Backpropagating on table 7
scanned table 7
scanned time = 28.400 seconds. CPU (92.860%) Wed Apr 21 18:39:39 2021
sorting table 7
Backpropagating on table 6
scanned table 6

generalhaar commented Apr 22, 2021

I have been encountering the same thing on Ubuntu. I have not been able to complete a plot yet because of it. My logs look exactly the same.

Starting phase 2/4: Backpropagation into tmp files. Thu Apr 22 00:59:02 2021
Backpropagating on table 7
scanned table 7
scanned time = 528.834 seconds. CPU (19.660%) Thu Apr 22 01:07:52 2021
sorting table 7
Backpropagating on table 6
scanned table 6
scanned time = 436.657 seconds. CPU (46.210%) Thu Apr 22 01:45:09 2021
sorting table 6
sort time = 1307.194 seconds. CPU (53.230%) Thu Apr 22 02:06:56 2021
Backpropagating on table 5
scanned table 5
scanned time = 543.170 seconds. CPU (40.130%) Thu Apr 22 02:16:09 2021
sorting table 5
sort time = 1364.188 seconds. CPU (48.930%) Thu Apr 22 02:38:53 2021
Backpropagating on table 4
scanned table 4
scanned time = 525.626 seconds. CPU (38.940%) Thu Apr 22 02:47:53 2021
sorting table 4
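
For readers comparing logs like the one above, a minimal sketch of pulling the phase 2 timings out of a plotter log may help. It assumes only the two line shapes visible in this thread ("scanned time = ..." and "sort time = ..."); the function name `parse_timings` is hypothetical and not part of Chia.

```python
import re
from typing import List, Tuple

# Matches the two timing line shapes seen in the logs quoted in this thread.
LINE_RE = re.compile(r"^(scanned|sort) time = ([\d.]+) seconds\. CPU \(([\d.]+)%\)")


def parse_timings(log_text: str) -> List[Tuple[str, float, float]]:
    """Return (kind, seconds, cpu_percent) for every timing line found."""
    timings = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            kind, seconds, cpu = match.groups()
            timings.append((kind, float(seconds), float(cpu)))
    return timings


SAMPLE_LOG = """\
Backpropagating on table 7
scanned table 7
scanned time = 528.834 seconds. CPU (19.660%) Thu Apr 22 01:07:52 2021
sorting table 7
Backpropagating on table 6
scanned table 6
scanned time = 436.657 seconds. CPU (46.210%) Thu Apr 22 01:45:09 2021
sorting table 6
sort time = 1307.194 seconds. CPU (53.230%) Thu Apr 22 02:06:56 2021
"""

if __name__ == "__main__":
    for kind, seconds, cpu in parse_timings(SAMPLE_LOG):
        print(f"{kind:8s} {seconds / 60:6.1f} min  CPU {cpu:5.1f}%")
```

A log that still gains new timing lines, even slowly, suggests the plot is progressing rather than hung.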

madmmmax commented Apr 22, 2021

It’s pretty normal. The progress usually increases once table 2 is sorted.

Christopherkynn commented Apr 23, 2021

Do I need to stop the plot if this happens or is it a long pause?

mzhu46 commented Apr 24, 2021

Do I need to stop the plot if this happens or is it a long pause?

Probably not, it just takes some time to get through that stage. Once past that stage, it will go much more smoothly.

akashkatare5 commented Apr 24, 2021

My Internet connection was lost, so I reconnected it, but when I saw the logs, it was stuck at 31% and wasn’t moving forward:

I have read that plotting does not require Internet, but my plotting is stuck in this phase for so long as you can see the time on the right side:

Forward propagation table time: 5508.986 seconds. CPU (44.880%) Sat Apr 24 07:50:11 2021
Time for phase 1 = 29391.886 seconds. CPU (57.300%) Sat Apr 24 07:50:11 2021

Starting phase 2/4: Backpropagation into tmp files. Sat Apr 24 07:50:11 2021
Backpropagating on table 7
scanned table 7
scanned time = 994.599 seconds. CPU (2.490%) Sat Apr 24 08:06:46 2021
sorting table 7
Backpropagating on table 6
scanned table 6
scanned time = 687.552 seconds. CPU (21.760%) Sat Apr 24 09:04:08 2021
sorting table 6

Should I delete all temp files and start again, or just let it run like this?

mzhu46 commented Apr 24, 2021

I am also a beginner, so my suggestion may be incorrect. However, if you can see that the countdown is continuing and the log page is not frozen, you can definitely wait for this plot to complete. scanned time = 994.599 seconds is normal, but CPU (2.490%) is too low, which may indicate that you can plot more in parallel.

akashkatare5 commented Apr 24, 2021

Actually, it continued for the next line, so will keep it running:

Forward propagation table time: 5508.986 seconds. CPU (44.880%) Sat Apr 24 07:50:11 2021
Time for phase 1 = 29391.886 seconds. CPU (57.300%) Sat Apr 24 07:50:11 2021

Starting phase 2/4: Backpropagation into tmp files. Sat Apr 24 07:50:11 2021
Backpropagating on table 7
scanned table 7
scanned time = 994.599 seconds. CPU (2.490%) Sat Apr 24 08:06:46 2021
sorting table 7
Backpropagating on table 6
scanned table 6
scanned time = 687.552 seconds. CPU (21.760%) Sat Apr 24 09:04:08 2021
sorting table 6

Also, I have set the below settings for plotting:

4770 MiB RAM and 4 threads

I have 8 GB RAM and an Intel i5-9th gen processor.
Will I be able to plot in parallel?

I saw that my CPU usage sometimes fluctuates between 25% and 125% and is not stable.

mzhu46 commented Apr 24, 2021

As I remember, the default setting is two threads, and dividing the number of cores of your CPU by 2 is a suitable way to estimate how many parallel plots will reach the CPU’s maximum capacity. I once plotted 4 in parallel on an i7-9700K, although I felt distressed seeing my CPU run at 100 percent for hours, and later I cut down to 3.

Chia plotting basics

February 22, 2021

Introduction

First it is important to know that there are two very different parts of being a Chia farmer. There is creating the plots or plotting and then there is farming the plots. In this post we are going to focus on the process of creating your plots. The types of machines and storage space are very different than the types of hardware you ultimately want to use to farm. You can see some example farming rigs on our very useful repository wiki.

We initially recommend that you try plotting with what you have around. The only caution is that you want to limit the number of plots you create that use your internal/consumer grade SSD as the temporary space. SSDs have very different wear lives and we have detailed information on SSD endurance.

You really never need to plot a plot with a k size larger than 32. Those who do plot larger are either doing so to show off (and we encourage this for fun) or to optimally fill the open space on a specific drive. A k32 will take up 101.3 GiB of space once completed but will need a total of 239 GiB of temporary space as it is being created. A single k32 plotting process never needs more than 239 GiB of space. One needs to be careful here, as a gibibyte (GiB) is 1024^3 bytes while a gigabyte (GB) is 1000^3 bytes. That means you will need 256.6 GB of temporary space and the final plot file will take 108.8 GB. A k32 plot can be done by one expert we know in just under 4 hours, but most experts are creating plots in 5 hours and most folks average around 9-12 hours.
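
The GiB to GB arithmetic above can be checked with a short sketch. The two sizes are the figures quoted in the text; the function name is only illustrative.

```python
GIB = 1024 ** 3  # bytes in one gibibyte
GB = 1000 ** 3   # bytes in one gigabyte

K32_FINAL_GIB = 101.3  # final k32 plot size quoted in the text
K32_TEMP_GIB = 239.0   # peak k32 temporary space quoted in the text


def gib_to_gb(gib: float) -> float:
    """Convert a size in gibibytes to gigabytes."""
    return gib * GIB / GB


if __name__ == "__main__":
    print(f"final plot: {gib_to_gb(K32_FINAL_GIB):.1f} GB")
    print(f"temp space: {gib_to_gb(K32_TEMP_GIB):.1f} GB")
```

Drive vendors usually advertise capacity in GB, so the GB figure is the one to compare against the label on the box.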

Creating a plot is a process that will take RAM, CPU cycles, and IO to your disks, and it will use them differently in each of the four phases of plotting. Everyone wants a magic “right” answer or to use AI to figure out the optimal plotting strategy for their machine. However, almost every machine is different along one of these parameters, so you just have to try. Longer term we will be able to query your machine and make some recommendations, but that is not today. You really will have to test. And no, the experts in the various Keybase channels don’t know your best settings either.

Getting going

The first phase generates all of your proofs of space by creating seven tables of cryptographic hashes and saving them to your temporary directory. Phase 2 back-propagates through the hashes, phase 3 sorts and algorithmically compresses these hashes in the temporary directory while starting to build the final file, and phase 4 completes the file and moves it into your final plot destination.

One of the major bottlenecks is usually the total sustained write speed of the disk underneath your temporary directory. We recommend used datacenter SSDs if you really want to go fast and not sacrifice consumer SSDs making plots. NVMe is faster than SAS and SAS is faster than SATA. This PC World overview of storage technologies can explain these acronyms and the differences. TBW, or terabytes written, is generally how SSD drive life is measured. One k32 writes 1.8 TiB in non-bitfield mode and 1.6 TiB with bitfield enabled. More on bitfield in a moment.
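
As a rough illustration of what those write volumes mean for drive life, here is a sketch that estimates how many k32 plots fit inside a TBW rating. The 600 TBW figure is a made-up example rather than any specific drive, and the sketch assumes the rating is in decimal terabytes and that plotting is the only write load.

```python
TIB = 1024 ** 4  # bytes in one tebibyte
TB = 1000 ** 4   # bytes in one terabyte

WRITES_NO_BITFIELD_TIB = 1.8  # per k32, from the text
WRITES_BITFIELD_TIB = 1.6     # per k32, from the text


def plots_within_tbw(tbw_rating_tb: float, tib_written_per_plot: float) -> int:
    """Estimate how many plots can be written before reaching the TBW rating."""
    tb_per_plot = tib_written_per_plot * TIB / TB
    return int(tbw_rating_tb // tb_per_plot)


if __name__ == "__main__":
    example_tbw = 600.0  # hypothetical rating, for illustration only
    print("no bitfield:", plots_within_tbw(example_tbw, WRITES_NO_BITFIELD_TIB))
    print("bitfield:   ", plots_within_tbw(example_tbw, WRITES_BITFIELD_TIB))
```

A TBW rating is a warranty figure rather than a hard failure point, so treat the result as a budgeting aid, not a prediction.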

Making the single fastest plot isn’t generally the best plotting strategy, however. Often you’re getting amazing speed because you’re using the turbo core of that multi-core processor. The folks who plot the most have shown that you should measure in TB (TiB if you’re old school like us) per day. The way to get the maximum TB/day is to plot lots of plots in parallel. Some of the top plotters use datacenter SSDs. Some use SAS drives. RAID 0 is often very handy to tie together a couple of small fast drives into one, say, 2TB partition so you could fit 5 k32 temporary spaces on that one virtual RAID drive.
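
To make the "measure per day, not per plot" point concrete, here is a small sketch comparing one fast plot with several slower parallel plots. It is an idealized model: it assumes each parallel process keeps the stated time per plot, which real hardware will only approximate, and the example timings are illustrative.

```python
K32_FINAL_GIB = 101.3  # final k32 plot size quoted earlier in the post


def tib_per_day(parallel_plots: int, hours_per_plot: float,
                plot_gib: float = K32_FINAL_GIB) -> float:
    """Plotted output in TiB per day for a steady-state parallel setup."""
    plots_per_day = parallel_plots * 24.0 / hours_per_plot
    return plots_per_day * plot_gib / 1024.0


if __name__ == "__main__":
    # One plot at a time, 4 hours each, versus four in parallel at 10 hours each.
    print(f"1 x 4h:  {tib_per_day(1, 4.0):.2f} TiB/day")
    print(f"4 x 10h: {tib_per_day(4, 10.0):.2f} TiB/day")
```

Even with each plot taking more than twice as long, the parallel setup produces more per day in this model.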

All of that said, for my personal plotting I use a 2017 iMac and a 12TB Western Digital external drive on USB 3.0 for both temporary and final directory, and I get a k32 about every 10 hours.

Good assumptions

There are some good rules of thumb for now. These can change as we will be returning to making some plotting speed improvements after launch. First we need to explain bitfield versus no bitfield plotting. Originally, the plotter did not use bitfield back sorting. The bitfield back sort is theoretically faster than not using the bitfield and we already know that it saves 12% of total writes but requires more RAM. We have a hunch we can speed bitfield up 10% and make it work on more processors but that’s not in there yet. What we do know is that, as long as you’re ok with the 12% more total writes, no bitfield will work faster when SSD or fast SAS is your temporary directory. If your temporary directory is on a regular HDD, like mine is, bitfield is 20% faster than no bitfield. Older CPUs may not see the speed increase as much as noted above.

Returning to the rules, here are a few. Never touch the stripe size of 65536. No one has found a speedup over that value and we are likely removing it from the options list. (Update: as of 3/11/21 stripe size has been removed as an option.) You almost never want to use any bucket value other than 128. Fewer buckets require more RAM for each plotting process; 64 buckets require twice the RAM.

As far as the number of threads is concerned, you are generally going to want 2 to 4. More than 4 seems to have diminishing returns, and 2 threads are a lot better than 1. More threads also require a bit more memory to successfully complete a plot. Threading is currently only used in phase 1.

As of Chia 1.0.4, RAM requirements are almost identical between bitfield and no bitfield. This is a chart of the various RAM choices assuming a k32 with 128 buckets and 2 to 4 threads:

RAM (MiB)      Minimum   Medium   Maximum
Bitfield           900     2640      3400
No bitfield        900     2640      3400

Below minimum your plot will fail. Medium is enough RAM that you’ll get most speed improvements, but not all. This is useful when you’re trying to get more plotting processes parallel and have limited RAM. Using anything over the maximum is wasting RAM as you will not plot any faster. We are pretty certain of the minimums and maximums but there is community debate about the medium values. We’ll update this chart accordingly as we have better data.
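
The chart and the paragraph above can be read as a simple lookup. The sketch below encodes the k32 values from the chart; the tier labels and the function name are our own shorthand, not anything the plotter reports.

```python
# k32, 128 buckets, 2 to 4 threads: values from the chart above (MiB).
K32_RAM_MIB = {"minimum": 900, "medium": 2640, "maximum": 3400}


def classify_buffer(buffer_mib: int) -> str:
    """Map a per-plot RAM setting onto the tiers described in the text."""
    if buffer_mib < K32_RAM_MIB["minimum"]:
        return "fail"      # below minimum: the plot will fail
    if buffer_mib < K32_RAM_MIB["medium"]:
        return "slow"      # works, but misses most speed improvements
    if buffer_mib <= K32_RAM_MIB["maximum"]:
        return "good"      # most or all speed improvements
    return "wasteful"      # above maximum: no further speedup


if __name__ == "__main__":
    for setting in (800, 900, 2640, 3400, 4000):
        print(setting, classify_buffer(setting))
```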

Mastering plotting

Most people start plotting from the GUI. You can successfully complete a couple of plots in parallel from there to get the hang of things. As people get more serious, they migrate to the command line. It is worth noting that Windows suffers 5-10% slower plot times versus macOS or Linux for now.
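
For those moving to the command line, here is a sketch of assembling a plotting command from the settings discussed above. The flags are the ones we recall from the 1.x `chia plots create` command (`-k`, `-n`, `-b`, `-r`, `-u`, `-t`, `-d`, and `-e` to disable bitfield); treat them as an assumption and confirm against `chia plots create --help` for your version. The helper name and the directory paths are hypothetical.

```python
import shlex
from typing import List


def build_plot_command(tmp_dir: str, final_dir: str, k: int = 32,
                       buffer_mib: int = 3400, threads: int = 2,
                       buckets: int = 128, count: int = 1,
                       bitfield: bool = True) -> List[str]:
    """Build an argument list for one plotting process (does not run it)."""
    cmd = [
        "chia", "plots", "create",
        "-k", str(k),           # plot size
        "-n", str(count),       # number of plots to make in sequence
        "-b", str(buffer_mib),  # RAM buffer in MiB
        "-r", str(threads),     # thread count
        "-u", str(buckets),     # bucket count
        "-t", tmp_dir,          # temporary directory
        "-d", final_dir,        # final plot destination
    ]
    if not bitfield:
        cmd.append("-e")        # assumed flag for no-bitfield plotting
    return cmd


if __name__ == "__main__":
    command = build_plot_command("/mnt/plot-tmp", "/mnt/plots")
    print(" ".join(shlex.quote(part) for part in command))
```

Building the list without running it makes it easy to review the settings first and to start several processes with different temporary directories.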

Once you get some experience you will probably want to know how to create more and more plots in parallel. Luckily we have a replay on YouTube of our cocktails with plotting experts. They had much to share about their various approaches. Some used servers and datacenter SSDs, some bought used servers and SAS drives for temporary directories, some expanded their consumer/gaming machines, and some focused on lots of smaller used machines. Many of them have compiled a spreadsheet of reference plotting hardware with plot speeds to help get you thinking about any hardware you might want to change or acquire and see how your plotting results measure up.

As you start parallel plotting you need to be careful not to over-allocate memory. If you cause your operating system to swap, you are not going to be happy with your outcome. You don’t have to be as careful with thread count.
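
One way to avoid over-allocating is to budget RAM before starting. The sketch below divides the memory left after a reserve for the operating system by the per-plot buffer setting. The 2048 MiB reserve is an arbitrary assumption, and real per-process usage can exceed the buffer setting somewhat (more threads need a bit more memory), so leave headroom.

```python
def max_parallel_by_ram(total_ram_mib: int, per_plot_mib: int,
                        reserve_mib: int = 2048) -> int:
    """Upper bound on parallel plots that fit in RAM without swapping."""
    usable = total_ram_mib - reserve_mib
    if usable < per_plot_mib:
        return 0
    return usable // per_plot_mib


if __name__ == "__main__":
    # 8 GiB machine at the medium setting, 64 GiB machine at the maximum.
    print(max_parallel_by_ram(8 * 1024, 2640))
    print(max_parallel_by_ram(64 * 1024, 3400))
```

RAM is only one limit; temporary disk space and sustained write speed will usually cap the parallel count sooner on large-memory machines.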

It is also a very common plotting strategy to plot on, say, your gaming machine and then move your plots to a Raspberry Pi 4 with a lot of USB ports. All you need is the same 24-word mnemonic on both machines. Alternatively, you can just run a remote harvester on your Pi and have it connect to your gaming machine, where you are running node and farmer, and keep your private keys on only one machine.

Learning more

Everyone trying to create plots should read through our repository FAQ. It really does answer 90% of the questions you might have about plotting (and farming.)

Once you have read the FAQ, you’ll find a supportive community in these public Keybase channels.

Keybase channel      Topic
#beginner            For those questions you are afraid to ask
#testnet             For all things testnet (an intermediate skill level)
#plotting-hardware   The expert plotters are here: hardware, software and plotting strategy

Thanks

@pyl, @kiwihaitch, @psydafke, and @storage_jm all helped out on this post. The mistakes are mine. Should something need to be updated I will edit and post the errata down here.

Updates

As of Chia version 1.0.4, RAM min/med/max values have been updated.
