What is the expected cpu resource consumption of zcashd?


#1

On my four-core laptop, zcashd seems to be taking 100% of one core just to operate in daemon mode.

I think I need to start over... Which reminds me - how can I verify that I've got an accurate copy of the proving keys?


#2

I just binned the old directory, did a fresh git clone, recompiled and ran the full test suite. Then I started zcashd in daemon mode. It took a little while, but the CPU use of one core did eventually rise to 100%, although the pattern of activity is a little different this time...


#3

Are you mining? Check if gen=1 in bitcoin.conf.


#4

My bitcoin.conf file only has entries for testnet, addnode, rpcuser and rpcpassword. But I'll add gen=0 to see if that makes a difference. Thank you for the suggestion!

Edit: It didn't seem to make a difference... Also, is this normal? zcashd is running in the background and my internet connection is up...

~/zcash$ ./src/zcash-cli getinfo
{
"version" : 110200,
"protocolversion" : 70002,
"walletversion" : 60000,
"balance" : 0.00000000,
"blocks" : 0,
"timeoffset" : 0,
"connections" : 0,
"proxy" : "",
"difficulty" : 0.00000000,
"testnet" : true,
"keypoololdest" : 1460456836,
"keypoolsize" : 101,
"paytxfee" : 0.00000000,
"relayfee" : 0.00005000,
"errors" : ""
}


#5

Sounds like this bug to me: https://github.com/zcash/zcash/issues/717

How about someone run zcashd under valgrind, address sanitizer ("asan"), or a CPU profiler tool and see if that tells us where the busy loop is?


#6

I just installed v11.2.z2 and watched varied cpu activity while (I assume) the new testnet blockchain downloaded but then the previously mentioned cpu activity resumed - 100% on a single core.


#7

Same issue here. Clean Ubuntu 14.04 install, zcashd takes 100% cpu of a 4 core Core i5 desktop.


#8

Same here. Debian 8.4 jessie, clean install, nothing but the basic system and the compilation prerequisites. No mining; 30 minutes after startup, zcashd is consuming 100% of one CPU on a 4-CPU virtual machine inside VirtualBox.

Similar results on Debian 7.10 wheezy.

@zooko: If you wish, I can package the jessie virtual machine for your inspection.


#9

Is anybody out there a C++ wizard who can debug https://github.com/zcash/zcash/issues/717?


#10

I just wanted to chime in with the same issue. zcashd maxes out a single thread at a time on an 8-core CPU, plus no coins are generated.
Running Ubuntu 16 in VirtualBox.


#12

EDIT: Never mind, I misunderstood that other post.

Now if we could only dig up a C++ guy to fix it...


#13

That guy did NOT mine 300+ blocks with a MacBook Pro in 24 hours. That was the total number of blocks generated by the whole network!

Two blocks mined in 16 hours with 2 mining threads is nothing out of the ordinary at the current testnet difficulty. This thread is discussing CPU consumption WITHOUT mining activity. The problem seems to occur on every system.


#14

Ah, lol, I misread the error he mentioned as affecting his machine only. :sweat_smile: That makes more sense.


#15

I'm not a wizard but I took a quick look.

On my system I am seeing 100% on a single thread. Running under gdb or "top -H" identifies the busy thread as "bitcoin-opencon"; specifically, it is spending its time in ThreadOpenConnections(), inside addrman.Select().

Here's a stack trace.

(gdb) where
#0  0x0000555555a709c0 in GetRand (nMax=64) at random.cpp:106
#1  0x0000555555a70a0a in GetRandInt (nMax=64) at random.cpp:111
#2  0x0000555555965c43 in CAddrMan::Select_ (this=0x5555565efd20 <addrman>) at addrman.cpp:360
#3  0x000055555584dbbd in CAddrMan::Select (this=0x5555565efd20 <addrman>) at addrman.h:541
#4  0x0000555555843051 in ThreadOpenConnections () at net.cpp:1274
#5  0x000055555585397b in TraceThread<void (*)()> (name=0x555555fb2a8e "opencon",
    func=0x5555558428c5 <ThreadOpenConnections()>) at util.h:217
#6  0x000055555588f106 in boost::_bi::list2<boost::_bi::value<char const*>, boost::_bi::value<void (*)()> >::operator()<void (*)(char const*, void (*)()), boost::_bi::list0> (this=0x555557ea4ab0,
    f=@0x555557ea4aa8: 0x555555853912 <TraceThread<void (*)()>(char const*, void (*)())>, a=...)
    at /home/zcash/zcash/depends/x86_64-unknown-linux-gnu/include/boost/bind/bind.hpp:313
#7  0x000055555588eb06 in boost::_bi::bind_t<void, void (*)(char const*, void (*)()), boost::_bi::list2<boost::_bi::value<char const*>, boost::_bi::value<void (*)()> > >::operator() (this=0x555557ea4aa8)
    at /home/zcash/zcash/depends/x86_64-unknown-linux-gnu/include/boost/bind/bind_template.hpp:20
#8  0x000055555588e3a9 in boost::detail::thread_data<boost::_bi::bind_t<void, void (*)(char const*, void (*)()), boost::_bi::list2<boost::_bi::value<char const*>, boost::_bi::value<void (*)()> > > >::run (this=0x555557ea48f0)
    at /home/zcash/zcash/depends/x86_64-unknown-linux-gnu/include/boost/thread/detail/thread.hpp:116
#9  0x0000555555be28ba in thread_proxy ()
#10 0x00007ffff71a1f8e in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#11 0x00007ffff6ecba0d in clone () from /lib/x86_64-linux-gnu/libc.so.6

I was able to get the CPU down to 1% with this patch:

$ git diff
diff --git a/src/addrman.cpp b/src/addrman.cpp
index c41ee3f..1886a83 100644
--- a/src/addrman.cpp
+++ b/src/addrman.cpp
@@ -341,6 +341,7 @@ CAddrInfo CAddrMan::Select_()
         // use a tried node
         double fChanceFactor = 1.0;
         while (1) {
+            MilliSleep(10);
             int nKBucket = GetRandInt(ADDRMAN_TRIED_BUCKET_COUNT);
             int nKBucketPos = GetRandInt(ADDRMAN_BUCKET_SIZE);
             if (vvTried[nKBucket][nKBucketPos] == -1)
@@ -356,6 +357,7 @@ CAddrInfo CAddrMan::Select_()
         // use a new node
         double fChanceFactor = 1.0;
         while (1) {
+            MilliSleep(10);
             int nUBucket = GetRandInt(ADDRMAN_NEW_BUCKET_COUNT);
             int nUBucketPos = GetRandInt(ADDRMAN_BUCKET_SIZE);
             if (vvNew[nUBucket][nUBucketPos] == -1)

However!!!
(1) I'm sure this is not the correct fix.
(2) zcashd is not making any connections after I applied this patch.

But maybe it can point someone in the right direction...


#16

What is interesting about that fix is that it touches a file we have never edited, i.e. that CPU consumption behaviour should also be present in Bitcoin Core 0.11.2.

EDIT: Having said that, this file concerns address management, and our testnet is currently running with a much more centralised architecture than Bitcoin Core, so it is possible that we are just triggering an edge case that was never encountered upstream.
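
To make the edge-case idea concrete, here is a minimal C++ sketch of why random probing gets expensive when a table is nearly empty. It is not the zcashd code: `SlotTable`, `SelectByProbing`, `ExpectedProbes` and the `max_probes` cap are hypothetical names for illustration, `std::mt19937` stands in for GetRandInt, the sizes 1024 and 64 are an assumption about the "new" table constants (the 64 matches nMax=64 in the stack trace), and the fChanceFactor weighting visible in the diff is left out.

```cpp
#include <cassert>
#include <cstdint>
#include <random>
#include <vector>

// Illustrative sizes only; assumed, not copied from addrman.h.
constexpr int kBucketCount = 1024;
constexpr int kBucketSize = 64;

struct SlotTable {
    // -1 marks an empty slot, like the vvNew / vvTried checks in the diff.
    std::vector<std::vector<int>> slots;
    int occupied = 0;

    SlotTable(int buckets, int size)
        : slots(buckets, std::vector<int>(size, -1)) {}

    void Put(int bucket, int pos, int id) {
        if (slots[bucket][pos] == -1) ++occupied;
        slots[bucket][pos] = id;
    }
};

struct ProbeResult {
    int id;                // selected entry id, or -1 if nothing was found
    std::uint64_t probes;  // number of random (bucket, pos) draws made
};

// Keep drawing random slots until an occupied one is hit.
// max_probes is a hypothetical safety cap; 0 means "no cap".
ProbeResult SelectByProbing(const SlotTable& t, std::mt19937& rng,
                            std::uint64_t max_probes = 0) {
    if (t.occupied == 0) return {-1, 0};
    std::uniform_int_distribution<int> bucket(0, static_cast<int>(t.slots.size()) - 1);
    std::uniform_int_distribution<int> pos(0, static_cast<int>(t.slots[0].size()) - 1);
    std::uint64_t probes = 0;
    while (max_probes == 0 || probes < max_probes) {
        ++probes;
        int id = t.slots[bucket(rng)][pos(rng)];
        if (id != -1) return {id, probes};
    }
    return {-1, probes};  // cap reached without a hit
}

// Average number of draws per selection: total slots / occupied slots.
double ExpectedProbes(const SlotTable& t) {
    assert(t.occupied > 0);
    return static_cast<double>(t.slots.size()) *
           static_cast<double>(t.slots[0].size()) / t.occupied;
}
```

With a single entry in a 1024 x 64 table, each draw hits with probability 1/65536, so one selection needs about 65536 random draws on average. If the connection thread keeps asking for addresses, that could plausibly keep one core busy, which would fit the trace above, though I have not verified that this is what actually happens in zcashd.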


#17

This appears to have come up with Bitcoin as well; see the relevant GitHub issue here:
github.com/bitcoin/bitcoin/issues/1664

It appears they believed the problem was fixed.


#18

Interesting, that seems like a possible explanation, since we are currently operating in a testnet environment. @str4d @zooko


#19

See https://github.com/bitcoin/bitcoin/issues/6903#issuecomment-218357010


#20

Just as an update: it appears that they may be close to a fix for this CPU consumption bug.

Bitcartel's solution is not yet implemented but seems close: https://github.com/zcash/zcash/pull/929

Nice work! @bitcartel


#21

What kind of address is the Select() function trying to find? Why isn't it available when operating on a testnet?