Different intervals shouldn’t influence the crawler results. The crawler interval mainly determines how often the application’s main loop is executed, and therefore how often it scans both the known and the not-yet-known network in search of nodes to connect to. Basically, new peers are crawled instantly, while peers that have already been crawled are crawled again after some time.
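To make the scheduling idea concrete, here is a minimal sketch of how such a loop could pick peers on each iteration. It is not the crawler’s actual code: `KnownNetwork`, `RECRAWL_AFTER` and the 300-second value are hypothetical names and numbers chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Hypothetical value: seconds before an already crawled peer is visited again.
RECRAWL_AFTER = 300.0


@dataclass
class KnownNetwork:
    # Maps a peer address to the time of its last crawl (None = never crawled).
    last_crawled: Dict[str, Optional[float]] = field(default_factory=dict)

    def add(self, addr: str) -> None:
        """Register a newly learned peer without overwriting existing state."""
        self.last_crawled.setdefault(addr, None)

    def due(self, now: float) -> List[str]:
        """Peers to crawl in this iteration of the main loop.

        New peers are always due; known peers are due once RECRAWL_AFTER
        seconds have passed since their last crawl.
        """
        return sorted(
            addr
            for addr, last in self.last_crawled.items()
            if last is None or now - last >= RECRAWL_AFTER
        )

    def mark_crawled(self, addr: str, now: float) -> None:
        """Record that a peer was crawled at time `now`."""
        self.last_crawled[addr] = now
```

In this sketch the loop interval only changes how often `due` is called, not which peers eventually get crawled, which is why different intervals should not change the results by themselves.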
Differences in how many Zebra nodes are found can have other causes. The first is the way peer lists are returned. The vast majority of nodes return 1k addresses, but they know many more: the lists are randomized, so a single run can find a different set of nodes each time. However, if the crawler runs long enough, the same peers are visited more than once, which lets it collect more peers. The second is a Zebra behaviour we’ve observed: there are time slots when Zebra stops responding to incoming connections, especially under heavy load. If the crawler hits such a slot, it won’t connect and won’t add that Zebra node to the list of known nodes. Moreover, if a Zebra node was initially added to the known network but doesn’t respond when the crawler tries to connect to it again, it won’t be reported in the final JSON file. We can discuss this problem on Discord.
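To illustrate the first reason, the toy simulation below models a single node that knows more addresses than it returns in one reply and answers each request with a random sample. It is only a sketch under assumed numbers: the 5000 known addresses, the function names and the fixed seed are all hypothetical, and real nodes may select addresses differently.

```python
import random
from typing import List, Set


def peer_list_reply(known: List[int], reply_size: int, rng: random.Random) -> Set[int]:
    """One reply: a random sample (without repeats) of the addresses a node knows."""
    return set(rng.sample(known, min(reply_size, len(known))))


def discovered_after(visits: int, total_known: int = 5000,
                     reply_size: int = 1000, seed: int = 0) -> int:
    """Count distinct addresses learned after visiting the same node `visits` times."""
    rng = random.Random(seed)            # fixed seed so the sketch is repeatable
    known = list(range(total_known))     # stand-in for the node's address book
    seen: Set[int] = set()
    for _ in range(visits):
        seen |= peer_list_reply(known, reply_size, rng)
    return len(seen)
```

A single visit yields exactly `reply_size` addresses here, while repeated visits to the same node keep uncovering addresses that earlier replies happened to omit, so a longer run sees more of the network.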