Building my own miner


#1

Back in September 2017 I started to get into mining. I have the Nheqminer source code on the PC and got a 1060ti GFX card to do some hashing too.
I am a hardware designer and also write software and firmware; I have been doing FPGA design work for almost two decades now. So I thought I’d set myself a challenge: can I make an FPGA outperform an i7 processor, or maybe even a GFX card?

Well, through many ups and downs, I am getting close. The main bottleneck is access to RAM. The FPGA doesn’t have enough RAM to do all the processing internally. Really you need around 180 MB. FPGAs with that amount of RAM are available, but at a huge cost, in the £1000s. My aim was to run an FPGA costing around £30 with 256 MB of DDR RAM.

I have been looking at Tromp’s CPU code, in particular the Digit1 to Digit8 code. It reads all the generated hashes every time: reading 64 MB and writing 64 MB in the early stages, and reading/writing less towards the later stages.

So we need to move around 500 MB through RAM to get around 2 solutions. The DDR3 RAM I’m using runs at 1066 Mb/s per pin (16-bit chip), which is about 2 GB/s max, not including refresh cycles. So I’m not going to get more than about 8 sol/s using this RAM. I could use more than one RAM chip to widen the data bus, but that needs more FPGA pins and therefore costs more. The Core i7 is doing around 14 sol/s.
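As a rough sanity check on those figures, here is a minimal Python sketch of the bandwidth ceiling. The function names are made up for illustration, and the inputs (500 MB moved per run, about 2 solutions per run, 1066 Mb/s per pin on a 16-bit bus) are just the estimates quoted above, not measurements.

```python
# Rough upper bound on sol/s when DDR bandwidth is the only limit.
# All inputs are the estimates quoted in the post, not measured values.

def ddr_peak_bytes_per_s(mbit_per_pin: float, bus_width_bits: int) -> float:
    """Peak transfer rate in bytes/s, ignoring refresh and bus turnaround."""
    return mbit_per_pin * 1e6 * bus_width_bits / 8


def sols_per_s_ceiling(bandwidth: float, bytes_per_run: float, sols_per_run: float) -> float:
    """Solutions per second if every run has to move bytes_per_run through RAM."""
    return bandwidth / bytes_per_run * sols_per_run


bw = ddr_peak_bytes_per_s(1066, 16)        # one 16-bit DDR3 chip
ceiling = sols_per_s_ceiling(bw, 500e6, 2)  # 500 MB moved for ~2 solutions
print(f"{bw / 1e9:.2f} GB/s peak -> at most {ceiling:.1f} sol/s")
```

Refresh cycles and read/write turnaround will pull the real number below this, so it is only an upper bound.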

What I am wondering is whether there is another way to do the Digit1 to Digit8 sorting. I thought it was just looking for duplicates, but tracking back the solution indexes, the source 192 bits are not identical.
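One possible explanation, as far as I understand the Equihash scheme: each round only pairs entries that agree on the next "digit" (20 bits for 200,9) and then XORs them, so the paired values are not expected to be identical overall. The toy Python sketch below uses a 4-bit digit and a hypothetical `collision_round` helper; it leaves out the index-ordering and distinct-index rules a real solver needs.

```python
from collections import defaultdict
from itertools import combinations

DIGIT_BITS = 4  # toy value; Equihash 200,9 uses 20-bit digits


def collision_round(items, total_bits, round_no):
    """items is a list of (value, index_tuple).

    Entries that agree on the current digit are paired and XORed, which
    zeroes that digit and leaves the remaining bits for later rounds.
    """
    shift = total_bits - DIGIT_BITS * (round_no + 1)
    mask = (1 << DIGIT_BITS) - 1
    buckets = defaultdict(list)
    for value, idx in items:
        buckets[(value >> shift) & mask].append((value, idx))

    out = []
    for group in buckets.values():
        for (a, ia), (b, ib) in combinations(group, 2):
            out.append((a ^ b, ia + ib))  # keep track of contributing indexes
    return out


hashes = [(0xA3, (0,)), (0xA7, (1,)), (0x15, (2,))]
print(collision_round(hashes, total_bits=8, round_no=0))
```

Here 0xA3 and 0xA7 are different values, but they share the top digit 0xA, so they pair up and their XOR (0x04) carries on to the next round together with the indexes (0, 1).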

I still have much code to convert to FPGA code, but working out the maximum DDR rate and how much data I need to move around gives me an idea of what hashing speed is to be expected.
I have the hashing code in VHDL and it’s generating hashes as quickly as the DDR can store them.
The sorting code is still in C so I can work out the best route forward. I have around 5Mb of internal FPGA block RAM, so I will pipeline reading and writing from DDR.

What I need to do is reduce the data transferred to and from the DDR. The data is stored in 4096 buckets, and I can take a whole bucket and sort it. It would be nice to process everything one bucket at a time, instead of moving to the next ‘Digit’ and going through all the buckets again.

The binary tree generation after Digit1-8 is working well. The SHA256 then completes in 30 µs and the result is compared with the target; that part is nice and whizzy, and matches how the PC code does it.
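For reference, the target comparison can be sketched in a few lines of Python. This assumes the usual Bitcoin-style convention (double SHA-256 over the header including the solution, with the digest read as a little-endian 256-bit integer); the post only says "SHA256", so treat those details and the `meets_target` name as my assumptions.

```python
import hashlib


def meets_target(header_with_solution: bytes, target: int) -> bool:
    """True if the double-SHA256 of the input, read little-endian, is <= target."""
    first = hashlib.sha256(header_with_solution).digest()
    digest = hashlib.sha256(first).digest()
    return int.from_bytes(digest, "little") <= target


# Placeholder input only: a real check would use the full header plus solution.
easiest = (1 << 256) - 1
print(meets_target(b"example header", easiest))
```

Running the same bytes through a check like this on the PC is a cheap way to confirm the FPGA filter agrees with the software filter.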

I have written many documents over the year with plans on what to do, tackling this from many angles. I’m using a Xilinx Zynq Z020, which gives me an lwIP TCP/IP connection to the internet; the fetched work will then be shared out to other FPGAs, perhaps Spartan-7 or Artix-7, each with their own dedicated DDR3 RAM.

One issue I have at the moment: it will connect to Slushpool, sign in, grab work, process that work and send back a result, but that result is neither refused nor accepted. I’m not sure what is going on there. I see in Wireshark that the packet gets split as it’s over 1500 bytes. The PC sends results in 2800-byte packets, so I thought it was the packet size, but I ran the PC code again this morning, and that was fragmenting the TCP/IP packets too and was being accepted. So it looks like Slushpool accepts fragmented packets from the PC, but not from my miner.

So there is still a long way to go; any pointers from anyone would be very useful.
I know I’m not going to compete with ASICs. The FPGA code could be converted to an ASIC, but that has a huge up-front cost. It’s mainly a private project; I may make a couple of machines with 64 or 128 FPGAs, all communicating with the FPGA that’s connected to the internet.


#2

Hi @TransAmDan welcome to the Forum! I haven’t personally seen someone build successful code for an FPGA for Equihash, so you would be the first :grinning:

Here are links to threads where the idea has been discussed before, maybe you could dig through them and find something useful or someone to team up with to tackle the project:

Good luck!


#3

Thanks for the reply and all the links.

I’m still progressing with this. I’m working on many areas at a time, so I’ll post more details later on how I tackled the blake2b hashing in VHDL. Sorting is the next big thing. It takes 4 seconds in C code on an 800 MHz processor, so I’m in the process of optimising that, which will also give me more accurate figures for how much RAM I need to shift. Knowing how much RAM I need to shift, we will then know the theoretical maximum speed that can be achieved with a particular RAM.

I have also been researching DDR RAM. Going 32-bit would give us more bandwidth than 16-bit, with fewer pins than using 2 x 16-bit RAM. I wish to use ISSI DDR RAM, because the Xilinx tools already have all the parameters for these ISSI devices entered. Now the only ISSI 32-bit RAM chip is 800 MHz, where the 16-bit ones are 1066 Mb/s. As bandwidth is the goal, the max 32-bit throughput at 800 MHz (excluding refresh cycles) is 3.2 GB/s, where 1 x 16-bit at 1066 Mb/s is 2.13 GB/s. So I have a few pros and cons to weigh up for RAM selection, but that is much further down the road, when I design the PCB.
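To put the RAM options side by side, here is a small Python sketch of peak bandwidth against data-pin count. It treats both the "800" and "1066" figures as per-pin transfer rates in MT/s, which is what the arithmetic above implies; the dictionary layout and function name are only illustrative.

```python
# Peak bandwidth versus data pins for the RAM options being weighed up.
# Rates are taken as per-pin transfers (MT/s); refresh is ignored.

def peak_gb_per_s(chips: int, width_bits: int, mt_per_s: float) -> float:
    return chips * width_bits * mt_per_s * 1e6 / 8 / 1e9


options = {
    "1 x 16-bit @ 1066": (1, 16, 1066),
    "2 x 16-bit @ 1066": (2, 16, 1066),
    "1 x 32-bit @ 800": (1, 32, 800),
}

for name, (chips, width, rate) in options.items():
    data_pins = chips * width
    print(f"{name}: {data_pins} data pins, {peak_gb_per_s(chips, width, rate):.2f} GB/s")
```

Note that this only counts data pins. Two separate 16-bit chips may also need their own address and control lines depending on how they are wired, which is where the single 32-bit part saves pins.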

Today I am using the Arty Z7 board by Digilent, which has a 256M x 16-bit DDR3 RAM on board with a Xilinx Z020 FPGA; this has around 75k logic cells and a dual-core ARM processor. My aim is to get it running on this even if it’s 4 sol/s, then use many other custom-designed FPGA boards to increase the throughput.
As of last night I am happy with the Stratum connection: signing in, grabbing work, and checking all parameters of the protocol including the ‘clean job’ flag, updating the nonce, and getting target information. It’s going well.
Today I am forcing it to work with work that I grabbed with Wireshark, so both the PC and the Arty Z7 are working with the same information. I’ll step through the code on both to check all results are identical. Each block on the Arty Z7 board has been individually tested, but it has only been chained together in the last week, so something may be incorrect and I need to check it all from beginning to end.


#4

Very Interesting, it sounds like you are already fetching work from the pool successfully.

Since you are thinking about chip selection, the Bitmain ASIC uses a chip called BM1740, which has 144 MB of RAM, which (unfortunately) is enough to run Equihash (at the current specs of 200,9). Perhaps the developers will change these specs in the future.

Z9 chips/teardowns:

https://bitcointalk.org/index.php?topic=4534369.0

From what I’ve seen, FPGAs have a very steep learning curve, and coding them properly is one of the main reasons they aren’t used more. I’ve seen one YouTuber asking for help with an FPGA that he bought but couldn’t get working without code.

I am a proponent of getting Zcash mining hardware into as many hands as possible which is why I supported GPU miners and am not a big fan of ASICs. What if FPGAs are a mining solution?

Quick brainstorm:
FPGAs are widely available (assumption)
Zcash Foundation sponsors coders (like you) to release open-source FPGA software for everyone.
Zcash Parameters are FPGA friendly and change every few months and coincide with the FPGA software release.
Could that help level the playing field between FPGAs and ASICs? To match this, ASICs would have to become FPGAs like the Blackminer F1 :thinking:


#5

Just out of curiosity, how many miners do you really know that run an FPGA? I personally know a lot of miners, GPU and ASIC miners, but not a single FPGA miner.


#6

Thanks for the info Shawn. That’s some great tear down info on the Z9 chip.

Memory on chip is the killer for FPGAs, and with it off chip you quickly run into bandwidth issues. My design won’t be as kick-ass as the Z9; however, if I can get 30 sol/s on one chip I’d be well happy, and it’s looking like it may be achievable with 4 x 16-bit DDR chips.

I’ve been looking at my code from 10 months ago, from when it generated the 2 million 192-bit results, checking that I’m passing the header blocks and nonce to it correctly. The Tromp C code applies the salt. This is all fixed information, so I apply it to the initialising hash before compiling and then use this modified initialising hash as a shortcut. I’ve made a few optimisations like this. Luckily I documented it, as I was starting to get a bit lost following it after so many months.
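The same shortcut can be illustrated in software: absorb the fixed part once, then copy that state for every hash index. This Python sketch uses `hashlib.blake2b` and assumes the Zcash convention as I understand it (personalization `ZcashPoW` followed by n and k as little-endian 32-bit values, 50-byte digest, 140-byte header including nonce). Those parameters and the helper names are my assumptions, not taken from the post, and what the post calls the salt is presumably this personalization string.

```python
import hashlib
import struct

N, K = 200, 9
PERSON = b"ZcashPoW" + struct.pack("<II", N, K)  # 16-byte personalization (assumed)
DIGEST_SIZE = (512 // N) * N // 8                 # 50 bytes, i.e. two 200-bit hashes


def make_base_state(header_and_nonce: bytes):
    """Absorb everything that stays fixed for one nonce, once."""
    state = hashlib.blake2b(digest_size=DIGEST_SIZE, person=PERSON)
    state.update(header_and_nonce)
    return state


def hash_at(base, index: int) -> bytes:
    """Finish the hash for one index from a copy of the shared state."""
    state = base.copy()
    state.update(struct.pack("<I", index))
    return state.digest()


header_and_nonce = bytes(140)  # placeholder: all-zero header and nonce
base = make_base_state(header_and_nonce)
print(hash_at(base, 0).hex())
```

In hardware the equivalent saving is precomputing the first compression block so that only the final block, which holds the index, is computed per hash.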

Before an ASIC is made, people usually run tests on an FPGA first. The same code can be used. The sort of FPGAs I’m coding can be put into an ASIC for something like $1 per ASIC; however, you have to pay a huge development cost to have your design converted from FPGA into an ASIC. A lot of companies go this route, as you get confidence in the design in the FPGA first. Plus, going to an ASIC on a smaller nm scale, you get a speed increase too. Sadly I’m not a business, just working on this in my free time, although this is really chewing up all my free time. Even when I’m out walking the dogs I’m constantly thinking about this puzzle, going to sleep with it on my mind and waking up with it still on my mind; it’s been like that for over a year. I will achieve the end solution. It’s not far off now. It is very close to submitting a solution, hopefully before New Year.


#7

Still making progress on this. I was aiming for it to submit a result and have it accepted before New Year’s Day. It is so close now.
I ran nheqminer on the PC, capturing in Wireshark. I see the nonce, target, and all the other data. I also see it submit a result.

I ran my FPGA with the same source information and I get the result. From a quick visual check, the first few hundred and the last characters of the solution are exact. Run with live data, the result is submitted but I get no reply. Perhaps there is a comma or something in the wrong place. I’ll do a more careful compare of the results to check they are identical. If the payloads of the TCP/IP packets are the same (the result is around 2800 bytes so the packet is split, but at a slightly different place to the PC), perhaps there is something in the header of the TCP/IP packet that needs jiggling. Getting very close now. I’ve been spending around 10 hours a day on this for almost 2 weeks solid.
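For the careful compare, a byte-for-byte check is more reliable than eyeballing a 2800-byte message. A minimal Python sketch, with a hypothetical helper name and made-up payloads standing in for the reassembled TCP payloads exported from Wireshark:

```python
def first_difference(a: bytes, b: bytes):
    """Return the offset of the first differing byte, or None if identical."""
    for offset, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return offset
    if len(a) != len(b):
        return min(len(a), len(b))  # one payload is a prefix of the other
    return None


# Made-up stand-ins for the two captured submit messages.
pc_payload = b'{"id":4,"method":"mining.submit","params":["w","job","t","n","sol"]}\n'
fpga_payload = b'{"id":4,"method":"mining.submit","params":["w","job","t","n","sol"]}'

print(first_difference(pc_payload, fpga_payload))
```

A difference reported at the very end of the shorter payload, as in this made-up pair, would point at a missing terminator rather than a wrong solution.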


#8

Thanks for your efforts here, but try to keep a hobby/life balance. This is an interesting project but it’s not critical in the grand scheme of things. Zcash isn’t going away anytime soon so take your time. This is a marathon not a sprint :wink:


#9

There are unreleased Equihash bitstreams listed here: http://zetheron.com/index.php/downloads/

I have heard that a brute-force approach may be possible with only on-chip memory and many blake2b rounds. Low efficiency but very high performance.


#10

Hello. What are the chances of mining a block reward solo? I’m interested in mining long term. What kind of coin numbers do you guys accumulate? I don’t care about the coin price, only the number of coins. Obviously it feels better to mine when you gain a little more than the current price. It’s an excuse to not buy coins.

I’m building some smaller things and am testing Zcash for some products, on its own or in combination with other blockchains.

All the best


#11

Hey ya, solo mining requires a lot of hardware and investment. Since ZEC is currently an ASIC coin you will need specialized hardware now; you can’t mine it with GPUs (well, you can, but it will lose money).

Looking at the website
https://www.coinwarz.com/calculators/zcash-mining-calculator

You can enter your hashrates in and see how many days it might take to find a block.

At 50k Sols, about 1 Z9 running at max you get
Days to generate one block mining solo: 77.73 Day(s) (can vary greatly depending on your luck).
Could also take a year, or a day, how lucky do you feel?

So unless you can get about 10+ Z9s for about 1 block every 7 days, you might want to join a pool to mine. You get almost the same amount of money, but it’s a lot more consistent.
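To put numbers on the luck, block finding can be modelled as a memoryless (Poisson) process, so the chance of at least one block within a given time follows an exponential distribution. This Python sketch uses the 77.73-day average from the calculator; the model itself is a standard assumption and the function name is made up.

```python
import math


def prob_block_within(days: float, mean_days: float) -> float:
    """Chance of finding at least one block within `days`, given the average wait."""
    return 1.0 - math.exp(-days / mean_days)


MEAN_ONE_Z9 = 77.73              # days per block at 50k sol/s, from the calculator
MEAN_TEN_Z9 = MEAN_ONE_Z9 / 10   # ten machines find blocks ten times as often

for days in (1, 30, 77.73, 365):
    print(f"1 x Z9, within {days} days: {prob_block_within(days, MEAN_ONE_Z9):.1%}")
print(f"10 x Z9, within 7 days: {prob_block_within(7, MEAN_TEN_Z9):.1%}")
```

Under this model even a full year of solo mining with one Z9 leaves a small but real chance of finding nothing, and all of it assumes the difficulty stays where it is today.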

At the current difficulty, one Z9 makes the amounts below. The difficulty changes all the time, and as it rises mining gets harder (fewer coins).

| Time Frame | ZEC Coins | BTC (ZEC/BTC at 0.01514250) | USD (BTC at $3,959.70) | Power Cost (in USD) | Profit (in USD) |
|---|---|---|---|---|---|
| Hourly | 0.00536031 | 0.00008117 | $0.32 | $0.06 | $0.26 |
| Daily | 0.12864739 | 0.00194804 | $7.71 | $1.49 | $6.23 |
| Weekly | 0.90053170 | 0.01363630 | $54.00 | $10.42 | $43.58 |
| Monthly | 3.85942155 | 0.05844129 | $231.41 | $44.64 | $186.77 |
| Annually | 46.95629558 | 0.71103571 | $2,815.49 | $543.12 | $2,272.37 |

In the current market, you’re still better off buying the coins. They are on sale right now! 90% off, limited time, buy now!..lol

Good luck!


#12

Hello. Thank you for the reply! I will probably get some later on. What sparks my interest in Zcash is that it could help one of my projects become fully secure, and that means being able to save lives.

Mining solo sounds more fun, as I like to have some luck. A one-day event would be a life-changing event! Like winning the lotto. It doesn’t feel safe to invest in a Z9 with the current noise. Is it the only real option?


#13

Well, the miners were selling for around 2K USD when they came out. They were making close to 30-40 USD a day. Now, a few months later, because the difficulty has gone up from the new miners mining faster, they will only make about 6 USD a day.

There is lots of risk with buying miners right now. There is risk with buying coins also. However, you have to remember the miners will cost you electricity in the meantime, so you have an upfront cost plus a recurring bill. If Zcash keeps dropping, or if more miners are added, the profits on the miners will continue to decrease, and eventually they will stop making money and cost money to run.

When purchasing coins, you will always have a set value that you purchased them at. If Zcash goes above that value, you make money. There are a lot fewer things to consider. With the resale value of ASICs being almost zero right now, you will be stuck with them if they stop making money, with nothing to show except the few coins you mined before they started losing you money.

I doubt the current ASICs will stay profitable for 3 more months, but I could be wrong.


#14

Fair point. I think Zcash can become one of the greats later on towards the future. It could well reach over 1 trillion in market cap.

We haven’t even seen the beginnings of the potential.

Would love to be able to mine and accumulate my own coins. Buying feels so much like stocks!


#15

If the market was back in the end of 2017 at its all time high, I would say buy the miners.

However, we are currently so close to the bottom in this market (I hope). Now is the best time to pick up coins for super cheap. Can it drop lower? Yes, it probably can. If you believe in the tech it shouldn’t matter though.

Example. Zec = $58 USD.
Z9 Miner = 2.5k(cost) + 500usd(Electricity) = 3k USD Total = 46 coins mined in 1 year.
3k USD = Buy ZEC at $58 USD = 51 ZEC purchased.

With the way the difficulty has been going up, it makes almost no sense to risk buying a miner over the coins. You will get about 5 ZEC difference; however, you have 1 year of RISK in assuming that the mining difficulty won’t change while the miner runs. And it is stupid to think like that, because it changes constantly and will continue to do so.
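The comparison can be worked through in a few lines of Python. The prices, costs and daily yield come from the posts above; the daily difficulty growth rate in the second case is a purely hypothetical figure, included only to show how quickly the mined total falls once difficulty keeps rising.

```python
def coins_if_bought(budget_usd: float, price_usd: float) -> float:
    return budget_usd / price_usd


def coins_if_mined(daily_coins: float, days: int, daily_difficulty_growth: float = 0.0) -> float:
    """Total coins mined when the daily yield shrinks as difficulty grows."""
    return sum(daily_coins / (1.0 + daily_difficulty_growth) ** day for day in range(days))


budget = 2500 + 500                            # miner plus a year of electricity, USD
bought = coins_if_bought(budget, 58)           # ZEC at $58
mined_flat = coins_if_mined(0.12864739, 365)   # daily yield from the calculator table
mined_rising = coins_if_mined(0.12864739, 365, daily_difficulty_growth=0.003)  # hypothetical

print(f"bought: {bought:.1f} ZEC")
print(f"mined, flat difficulty: {mined_flat:.1f} ZEC")
print(f"mined, rising difficulty: {mined_rising:.1f} ZEC")
```

With flat difficulty the gap is already in favour of buying; any sustained rise in difficulty widens it further.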


#16

So we are moving first towards ASIC resistance and eventually PoS of some form. Does that make sense?


#17

#18

Following up on my reply 12 days ago, I’m still working on this. I managed to get the TCP/IP results sent quicker, with no gap between packets. So I believe I’m sending the correct results back, but then the server closes the connection. I can see this in Wireshark: the ‘FIN’ flag gets set in the response…

I know it’s not going to be profitable to get this FPGA going due to ASICs; however, it would be great to get this going and see results, even if they are very small.

Is there a place I can check results? Like taking the nonce, header, time, difficulty etc., and checking them against my result. I have checked my results against the PC application Nheqminer, and I’m pretty sure my results are the same, but something is not working correctly.

I have read in the forum that many people have started an FPGA design, but not yet heard of a completed one, it would be nice to be the first.

My design is currently on the Xilinx Zynq board, which has a dual-core ARM processor running at around 700 MHz. I’m only using one core at present, which services the TCP/IP. As work comes in, it is sent into the FPGA fabric and hashing begins. At present the sorting is done in C code on one of the CPU cores, so it is not fast, but it proves a point. The difficulty filter is then also done on one of the CPU cores, and a solution that passes is sent out via TCP/IP, the nonce is incremented and it carries on.

Looking at the TCP/IP packets, they look just like the ones from Nheqminer, apart from the 2830ish-byte solution being split in a slightly different place. Mine are split at 1500, but this should be okay, as the extracted payload data is the part that matters and that looks good.
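That matches how a line-based receiver works: TCP is a byte stream and Stratum messages are newline-terminated JSON, so where the segments are cut should not matter. The Python sketch below is a generic illustration with a hypothetical `split_lines` helper, not the pool’s actual code.

```python
def split_lines(chunks):
    """Reassemble newline-terminated messages from arbitrary TCP-sized chunks."""
    buffer = b""
    for chunk in chunks:
        buffer += chunk
        while b"\n" in buffer:
            line, buffer = buffer.split(b"\n", 1)
            yield line  # one complete message, without its terminator


# Stand-in for a ~2830-byte submit message.
message = b"x" * 2830 + b"\n"

split_a = [message[:1500], message[1500:]]   # cut at 1500 bytes
split_b = [message[:1460], message[1460:]]   # cut somewhere else

print(list(split_lines(split_a)) == list(split_lines(split_b)))
print(list(split_lines([message[:-1]])))     # same bytes, but no trailing newline
```

A receiver written like this sees the same message for both splits, but never sees a complete message at all if the terminating newline is missing, so that is one cheap thing to rule out in the capture.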

I’m not going to give up. I started this well over a year ago, have come up against many hurdles, and found solutions. It’s been great learning so much about this FPGA/processor; even if it makes little return when it’s done, I will have gained knowledge along the way.


#19

Just as a side note: maybe your work should be aimed more at Equihash 150,5, the new 2nd ZEC algo coming in about 9 months, designed for GPU mining.

I mean, on the current algo you are behind: not only are you technically working against ASICs that are more efficient on the current algo, but you also aren’t ready yet. That said, working on your FPGA and preparing for Equihash 150,5 might at least bear some fruit for all your hard work, and you would be more competitive versus GPUs than versus ASICs.

Just a thought…


#20

That is a great thought. It’s not impossible to move to a different form. I’ve not heard of 150,5. I have seen 144,5, which requires 2 GB of RAM and, as you say, could bear some fruit. I have 512 MB of RAM on this board, but I can make a board with 8 GB, so I have some room for expansion. It will be DDR3, so not quite as fast as GFX card RAM.
Thanks for the idea, this could be my new path…