1.6 fix 1234
Okay, I'm happy to present an official fix for issue #1234 (closed) that should fix the problems that both @librelois and @moul may have encountered in the past with PoW.
To sum up, this fix:
- wraps the proof.ts code in a function, to avoid global conflicts
- moves the worker-specific code into a new class, `PowWorker`
- systematically synchronizes all the workers for any task: if a proof is requested, the answer is not returned to the program immediately but first waits for the other workers to have cancelled their own proof. If a cancel is triggered, same idea: the code waits for all the workers to have cancelled before returning `null` as the proof result.
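The synchronization described in the last point can be sketched roughly as follows. This is a minimal illustration, not the code of this merge request: `SketchWorker`, `askProof` and `cancelAll` are hypothetical names, and the workers here are simulated with timers in a single process rather than real worker processes.

```typescript
// Hypothetical sketch: SketchWorker stands in for a PoW worker and is NOT
// the PowWorker class introduced by this merge request.
type Proof = { nonce: number } | null;

class SketchWorker {
  private id: number;
  private findsAfterMs: number | null;
  private cancelled = false;
  private finished: Promise<Proof>;

  constructor(id: number, findsAfterMs: number | null) {
    this.id = id;
    this.findsAfterMs = findsAfterMs;
    this.finished = this.run();
  }

  // Simulated search: "finds" a proof after findsAfterMs, or never if null.
  private async run(): Promise<Proof> {
    const start = Date.now();
    while (!this.cancelled) {
      if (this.findsAfterMs !== null && Date.now() - start >= this.findsAfterMs) {
        return { nonce: this.id };
      }
      // Yield so that a cancel request can be taken into account.
      await new Promise<void>((resolve) => setTimeout(resolve, 1));
    }
    return null;
  }

  cancel(): void {
    this.cancelled = true;
  }

  // Resolves once this worker has really stopped (proof found or cancelled).
  whenStopped(): Promise<Proof> {
    return this.finished;
  }
}

// Cancel every worker and only return null once all of them have stopped.
async function cancelAll(workers: SketchWorker[]): Promise<null> {
  workers.forEach((w) => w.cancel());
  await Promise.all(workers.map((w) => w.whenStopped()));
  return null;
}

// Ask for a proof: the first result is held back until every other worker
// has acknowledged its own cancellation.
async function askProof(workers: SketchWorker[]): Promise<Proof> {
  const first = await Promise.race(workers.map((w) => w.whenStopped()));
  await cancelAll(workers);
  return first;
}
```

The point of the design is that neither function returns while a worker is still computing, so the next task always starts from a state where every worker is idle.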
I hope you will enjoy this new version. I also tested it on the Ğ1 network in a run more than 10 blocks long. It just works.
Happy testing :)
added 1 commit
- 3bb0aa73 - [fix] #1234 (closed) Synchronize workers on each PoW task
I managed to reproduce the bug after a few hours (I checked with `git log` that I really had the latest commit):
2017-12-24T01:16:50+01:00 - info: ⬇ Cn7YZYRACKczHjyqD376q5uBFF66kL1gbL3psjrmt3dh IN
2017-12-24T01:16:50+01:00 - info: ⬇ HJU98K3jH1D8XvPrBiotgCEnUB9yShPFyn6GSfq28txV IN
2017-12-24T01:16:50+01:00 - warn: httpCode=400, ucode=2007, message=Already received membership
2017-12-24T01:16:50+01:00 - warn: httpCode=400, ucode=2007, message=Already received membership
2017-12-24T01:16:57+01:00 - info: ✔ PEER 4GX5gUFw
2017-12-24T01:16:57+01:00 - info: ✔ PEER 2sZF6j2P
2017-12-24T01:16:57+01:00 - info: ✔ PEER Do99s6wQ
2017-12-24T01:17:02+01:00 - info: Matched 3 zeros 00025B7C985671DADA194862CA792DF021D1CB49175BE1B37CC832A1BFC9F21B with Nonce = 20100004654363 for block#80237 by D9D2za
2017-12-24T01:17:13+01:00 - info: WS2PTOR: Could not connect to peer 2ZvEsd6s using `WS2PTOR txeyf3zs65sq2g3v.onion 80: WS2P connection timeout`
2017-12-24T01:17:13+01:00 - info: Matched 3 zeros 000297F0923C48D6719EC5040C56C349EC01BDC101D08123F775B15869EF6741 with Nonce = 20100004657583 for block#80237 by D9D2za
2017-12-24T01:17:13+01:00 - warn: WS2P: cannot connect to incoming WebSocket connection: WS2P connection timeout
2017-12-24T01:17:15+01:00 - info: Matched 3 zeros 0009A4C5E2A238A859BB05FFD65B094CE050D1656A91B7301152563348488126 with Nonce = 20100004658420 for block#80237 by D9D2za
2017-12-24T01:17:24+01:00 - info: Matched 3 zeros 00030EB6A30F0A907B2A88F9752C9B0190C9E2337C3D6A1821E1119406ED5011 with Nonce = 20100004661266 for block#80237 by D9D2za
2017-12-24T01:17:37+01:00 - info: Matched 4 zeros 00008C312E697127CD516C0FC0B6955CBC2C2F1455CB2444F19690E2730A682A with Nonce = 20100004665596 for block#80237 by D9D2za
2017-12-24T01:17:42+01:00 - info: Matched 4 zeros 00009558071CC52884ED8EFF2225B98846F8E1F36BDCF06628A042793E14024A with Nonce = 20100004666949 for block#80237 by D9D2za
2017-12-24T01:17:42+01:00 - info: ✔ PEER 32jZNQLK
2017-12-24T01:17:42+01:00 - info: Matched 3 zeros 00053CF53941A399591B97AF865095347A4F5ECBD7330BB2863604F723152005 with Nonce = 20100004667051 for block#80237 by D9D2za
2017-12-24T01:17:43+01:00 - info: WS2PTOR: Could not connect to peer 4tAz49Vt using `WS2PTOR x5mlxikgc6dazen4.onion 20901: WS2P connection timeout`
2017-12-24T01:17:43+01:00 - info: Matched 3 zeros 0003D24BABF77C34E90FED5F6B43EEC205E9DDF0F4C25ACCD9B70E9BE0323E90 with Nonce = 20100004667256 for block#80237 by D9D2za
2017-12-24T01:17:44+01:00 - error: WS2P >>> >>> WS ERROR: INCORRECT_PUBKEY_FOR_REMOTE
2017-12-24T01:17:44+01:00 - error: WS2P >>> >>> WS ERROR: INCORRECT_PUBKEY_FOR_REMOTE
2017-12-24T01:17:49+01:00 - info: Matched 3 zeros 000C1D90196FEE12F0BC9C4EC9E2D073BF56C3F423E96D836DD0AF0593DA0782 with Nonce = 20100004669223 for block#80237 by D9D2za
2017-12-24T01:17:57+01:00 - info: Matched 3 zeros 000F160AB82D96CB356DDE2201FC9548D801EA90BCD95AF037DA47AB021246BD with Nonce = 20100004672051 for block#80237 by D9D2za
2017-12-24T01:18:00+01:00 - info: ✔ PEER 5SwfQubS
2017-12-24T01:18:10+01:00 - info: ✔ PEER 8g7unwbN
2017-12-24T01:18:13+01:00 - info: WS2PTOR: Could not connect to peer J8aAWyZE using `WS2PTOR 3k2zovlpihbt3j3g.onion 20901: WS2P connection timeout`
2017-12-24T01:18:28+01:00 - info: WS2P: Could not connect to peer 5gJYnQp8 using `WS2P g1.aerisryzdvrx7teq.onion 53012: WS2P connection timeout`
2017-12-24T01:18:29+01:00 - info: Matched 3 zeros 0003AA84012F26E16EFAB317F49285B24F70CDC04AF8C1E6D5ADAABCEA4DE1A1 with Nonce = 20100004682380 for block#80237 by D9D2za
2017-12-24T01:18:38+01:00 - info: ✔ PEER DfjVNNn7
2017-12-24T01:18:48+01:00 - info: Matched 3 zeros 0005ED806127EE554BA993520EDED98870028D76B51A401E06D8C62F44D9CE80 with Nonce = 20100004688192 for block#80237 by D9D2za
Edited by Éloïs
I don't see a bug in these logs. What do you see as wrong?
Also, to be sure that you have the correct sources, could you check that you find some `[done]` text in your logs? If you don't, `yarn` probably wasn't launched after the checkout.
If you can find logs with `[done]`, however, could you show me logs demonstrating that the bug is still here? I will also try to run longer tests on my side.
@librelois I would like to know exactly what you reproduced, because I can't reproduce it.
I did find a bug, but not the one you mention.
added 1 commit
- 155c98a7 - [fix] #1234 (closed) Need at least 1ms for PoW pauses
Anyway, on my side I found the following bug: when a PoW cancel is requested, the cancellation may not be performed. This was because the async cycle left no room to acknowledge the cancellation.
I've just fixed this by changing the waiting time from 0ms to 1ms, which should be enough.
The definitive fix would be the shared memory mechanism of Node.js 10, as said in #689 (closed). So right now, even if I hope this 0ms to 1ms fix is enough to get correct behavior, we still can't have a definitive fix for PoW because it is still CPU dependent.
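To illustrate the idea behind the pause, here is a minimal sketch of a CPU-bound search split into batches, with a timer pause between batches so that pending events (such as a cancel request) get a chance to run. It is an assumption-laden illustration, not the actual proof.ts code: `CancellableSearch`, `pause`, `PAUSE_MS` and the batch size are hypothetical names and values.

```typescript
// Hypothetical sketch, not the code of this merge request.
const PAUSE_MS = 1; // the commit above moves this kind of pause from 0ms to 1ms

// Timer-based pause: hands control back to the event loop for about `ms`.
function pause(ms: number): Promise<void> {
  return new Promise<void>((resolve) => setTimeout(resolve, ms));
}

class CancellableSearch {
  private cancelRequested = false;

  // Would typically be called from a message handler when a cancel arrives.
  requestCancel(): void {
    this.cancelRequested = true;
  }

  // Tries nonces in batches; returns the first nonce accepted by isSolution,
  // or null if a cancel was observed between two batches.
  async run(
    isSolution: (nonce: number) => boolean,
    batchSize: number = 1000
  ): Promise<number | null> {
    let nonce = 0;
    while (!this.cancelRequested) {
      for (let i = 0; i < batchSize; i++, nonce++) {
        if (isSolution(nonce)) {
          return nonce;
        }
      }
      // Without this pause the loop never lets other callbacks run,
      // so requestCancel() could not be taken into account.
      await pause(PAUSE_MS);
    }
    return null;
  }
}
```

A cancel is only observed at batch boundaries here, so the batch size bounds how long a cancellation can be delayed; how long a batch takes still depends on the CPU, which is the limitation mentioned above.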
mentioned in merge request !1214 (closed)
Here is a dump of the last 17 hours of running of my personal node: dump_1234.txt
This log was produced using:
tail ~/.config/duniter/g1/duniter.log -n 300000 | grep "\[done\]\|zeros\|added\|resolution\|cancelled" > dump_1234.txt
The PoW works perfectly for me.
@moul, @librelois, I will wait until this weekend and then merge unless one of us reproduces a bug.
@c-geek OK, I have never reproduced it since; it must have been related to my environment. I'm OK to merge :)
mentioned in commit d7988cc2