An ID Generator in C
I wrote a C program that generates a 16-character ID from 10 digits (0–9), 26 uppercase letters (A–Z) and 26 lowercase letters (a–z). That gives us 62 possible characters for each slot. Since the ID is 16 characters long, the total number of possible IDs is 62 raised to the power of 16, which is roughly 4.8 × 10^28 (about 48 octillion) combinations. In other words, if every character is chosen truly at random, the probability of generating the same ID twice is practically 0 for all real-world purposes. Here’s the program:
A couple of issues you might consider. First, rand() returns a pseudo-random number in the range [0, RAND_MAX], where RAND_MAX is implementation-defined (2,147,483,647 with glibc), and the whole sequence it produces is determined by the unsigned int passed to srand(). With a typical 32-bit unsigned int that means at most about 4.3 billion distinct ID sequences, far fewer than 62^16. Second, time(NULL) has a precision of one second, which means that any two processes started in the same second will generate the same ID (and if you provide this as a library, the same sequence of IDs). Even if time had ns precision, there would still be roughly a one in a billion chance that two processes started in the same second collide (one in a thousand with ms precision), and the chances increase scarily as you add more processes.
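To make the seeding issue above concrete, here is a minimal sketch. The original program is not shown in the thread, so `id_from_seed`, `ALPHABET` and `ID_LEN` are hypothetical names, and the seed is passed in explicitly instead of coming from time(NULL).

```c
#include <stdlib.h>

#define ID_LEN 16   /* assumed ID length, taken from the post */

/* Assumed alphabet: 10 digits, 26 uppercase, 26 lowercase = 62 characters. */
static const char ALPHABET[] =
    "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";

/* Hypothetical helper: seed rand() and fill out[] with an ID_LEN-character ID.
   Everything it writes depends only on the seed. */
static void id_from_seed(unsigned seed, char out[ID_LEN + 1])
{
    srand(seed);
    for (int i = 0; i < ID_LEN; i++)
        out[i] = ALPHABET[rand() % (sizeof ALPHABET - 1)];
    out[ID_LEN] = '\0';
}
```

Calling `id_from_seed` twice with the same seed fills both buffers with the same string, which is what happens when two processes call srand(time(NULL)) within the same second.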
Neat. Doing this in a production-grade way is surprisingly subtle and you might be interested to look at the history of the UUID standard. https://en.wikipedia.org/wiki/Universally_unique_identifier
Great! I think it would be neater if you used a function to get a random character on each iteration. The function would pick an ASCII value from one of the ranges 48-57 (digits), 65-90 (uppercase) or 97-122 (lowercase). You iterate over the generated_ID array, store the value returned on each iteration, and at the end add the '\0' byte (0). Then you print the result using puts(generated_ID). Using a function saves you from defining a lookup array of values, though the saving is small: a fixed-size array like that is laid out by the compiler in static or stack storage, so no malloc is involved. One may argue against this approach for scalability reasons, and if that's the issue, then the array approach you used is fine; then again, extending the function is as easy as adding an extra case to the switch statement in your random_value generator. Well done though...many run away from C for some unknown reasons 🤣
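The function-based idea above could be sketched roughly like this. `index_to_char` and `random_id_char` are hypothetical names, the code assumes an ASCII-compatible character set, and it uses if-chains where the comment suggests a switch.

```c
#include <stdlib.h>

/* Hypothetical helper: map an index in [0, 62) to a digit, an uppercase
   letter or a lowercase letter. Assumes an ASCII-compatible character set. */
static char index_to_char(int r)
{
    if (r < 10)
        return (char)('0' + r);          /* ASCII 48-57  */
    if (r < 36)
        return (char)('A' + (r - 10));   /* ASCII 65-90  */
    return (char)('a' + (r - 36));       /* ASCII 97-122 */
}

/* One draw over all 62 characters, mapped to its range afterwards. */
static char random_id_char(void)
{
    return index_to_char(rand() % 62);
}
```

Drawing a single index over all 62 characters and then mapping it is a deliberate choice here: picking one of the three ranges first, with equal probability, would make each digit more likely than each letter.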
The time.h does the trick
Nitpick: it would be better to define ID_len as 16. Then you have to adjust it in only one place, the generated_ID declaration, and in the two other places you can use it as it is. This is irrelevant for performance (the compiler will substitute a constant anyway), but it makes the code simpler to reason about: adding 1 in an array declaration is easy to understand for anyone who knows C, while the adjustments in the loop and when setting \0 (as it is now) require looking at the constant's definition to realize that it is 1 byte longer than the declared result length. Just my 2 cents...
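A rough sketch of what this nitpick suggests, since the original program is not reproduced in the thread. `ID_len` and `generated_ID` are the names used in the comment; `CHARSET` and `fill_id` are hypothetical.

```c
#include <stdlib.h>

#define ID_len 16   /* visible length of the ID, without the terminator */

/* Assumed character set: digits, uppercase and lowercase letters. */
static const char CHARSET[] =
    "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";

/* The +1 for the terminator appears only in the buffer size. */
static void fill_id(char generated_ID[ID_len + 1])
{
    for (int i = 0; i < ID_len; i++)        /* loop bound needs no adjustment */
        generated_ID[i] = CHARSET[rand() % (sizeof CHARSET - 1)];
    generated_ID[ID_len] = '\0';            /* terminator index needs no adjustment */
}
```

A caller would declare `char generated_ID[ID_len + 1];`, pass it to `fill_id`, and print it with puts().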
If you'd like, message me, i'll give you some further food for thought. 👍
A problem with this code is that it seeds the random generator with the current time in seconds. That means that if you run it twice in one second, you get the same ID. Also, anyone who knows approximately when the program was run can regenerate the IDs it most likely produced. On Linux, prefer /dev/random for better quality random numbers. Also, read or watch https://www.usenix.org/conference/usenixsecurity12/technical-sessions/presentation/heninger about the problem of generating good quality random keys on devices with low amounts of entropy.
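As a sketch of the kernel-randomness suggestion above: the code below reads from /dev/urandom, the non-blocking counterpart of the /dev/random device the comment mentions. That is a swap made for this example, not what the comment prescribes. `id_from_urandom`, `ALPHABET` and `ID_LEN` are hypothetical names.

```c
#include <stdio.h>

#define ID_LEN 16   /* assumed ID length, taken from the post */

static const char ALPHABET[] =
    "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";

/* Fills out[] with ID_LEN characters drawn from the kernel's random device.
   Returns 0 on success, -1 if the device cannot be opened or read. */
static int id_from_urandom(char out[ID_LEN + 1])
{
    FILE *f = fopen("/dev/urandom", "rb");
    if (f == NULL)
        return -1;

    int filled = 0;
    while (filled < ID_LEN) {
        int byte = fgetc(f);
        if (byte == EOF) {
            fclose(f);
            return -1;
        }
        /* 248 = 4 * 62: skip bytes 248..255 so every character is equally likely */
        if (byte < 248)
            out[filled++] = ALPHABET[byte % 62];
    }
    out[ID_LEN] = '\0';
    fclose(f);
    return 0;
}
```

No seed is involved here, so two runs in the same second do not share a starting state, and knowing the start time does not help in guessing the output.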
Couple of mathematical gotchas to watch out for: 1. Birthday Paradox (https://en.wikipedia.org/wiki/Birthday_problem#Generalizations): for N possibilities, you have a 50% chance of a collision after generating only about 1.18 * sqrt(N) values. So you'll likely start having collisions after a couple hundred trillion generations, not tens of octillions. Still in the realm of implausible, but worth knowing about. 2. rand() % n is only evenly distributed if n divides RAND_MAX + 1, the number of values rand() can return. Neither of these affects your use case, but they show that the concerns we have get more convoluted as we use our tools to do more stuff.
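For the second gotcha, one common workaround is rejection sampling: throw away the uneven tail of rand()'s range before taking the remainder. A minimal sketch, where `uniform_below` is a hypothetical helper and the code assumes unsigned int can hold RAND_MAX + 1 (true on mainstream platforms):

```c
#include <stdlib.h>

/* Hypothetical helper: integer in [0, n) with every value equally likely,
   for 0 < n <= RAND_MAX + 1. Assumes unsigned can hold RAND_MAX + 1. */
static unsigned uniform_below(unsigned n)
{
    unsigned range = (unsigned)RAND_MAX + 1u;   /* number of distinct rand() results */
    unsigned limit = range - (range % n);       /* largest multiple of n within range */
    unsigned r;

    do {
        r = (unsigned)rand();
    } while (r >= limit);                       /* reject the uneven tail and retry */

    return r % n;
}
```

With n = 62 the rejected tail is at most 61 values, so retries are rare. This only removes the modulo bias; it does nothing about how rand() is seeded.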
You could also check where the rand gets its seed from.
Looks so cool! Congrats!