Sunday, June 7, 2020

Computing arithmetic gymnastics

Today was a day well spent programming a brute-force solution to a simple arithmetic puzzle. The puzzle is something I came up with during a daydream a long time ago. It's simple arithmetic gymnastics for when you have nothing better to do.

You start off with a 4-digit number, let's take 1234 as an example. The goal is to build an expression out of the separate digits using any of the four basic arithmetic operations: addition, subtraction, multiplication or division, such that the result is exactly zero. You may rearrange the digits in any order and add parentheses anywhere.

Some valid expressions in this case would be:

1 + 2 + 3 + 4 = 10
1 - 2 + 3 - 4 = -2
(1 + 3 - 4) * 2 = 0

Only the last example satisfies the condition and counts as a solution in this case.

I found that every number I tried had a solution; I could not find a number that was not solvable. However, the question remained: is every 4-digit number solvable? So today I set out to find the answer.

Programming out the problem

When we solve this puzzle by hand, we often use heuristics to quickly find obvious solutions. The most obvious one: when the number contains a zero, you can simply multiply all digits together and end up with zero. Another simple strategy applies when two digits are the same: subtracting one from the other creates your zero, and multiplying by the remaining digits gives the solution again.
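
For illustration, these two heuristics could be checked with something like the following (a sketch, not part of the brute force; it assumes the digits are available as an int array and that System.Linq is in scope):

private static bool SolvableByHeuristic(int[] digits)
{
    // Any zero: multiply all digits together to end up with exactly zero.
    if (digits.Contains(0))
        return true;

    // Any repeated digit: subtract the pair, then multiply by the remaining digits.
    return digits.Distinct().Count() < digits.Length;
}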

Before arriving at heuristics to speed up the computation, I wanted to solve this problem the hard way: brute forcing a solution. Since computers can execute far more instructions per second than I can while doing simple arithmetic, a problem of this size should be easy to solve. I did not care about performance at first; the simplest solution would do. The language of choice was C# for its ease of use.

The problem at hand is easy to imagine, since we are used to calculating with these numbers daily. It is, however, harder to program out all possible solutions. The method I used to arrive at a solution was some combinatorics.

Part one: calculating all possible number combinations

The first observation I made was that some of the operations are not commutative: subtraction and division produce different results when you swap the operands around. For this reason we need all permutations of the digits instead of combinations, since the order in which the digits appear matters!

Creating a recursive function to generate all possible digit permutations looks as follows:

private static IEnumerable<int[]> CalculatePermutations(IEnumerable<int> list, int length)
{
    if (length == 1)
        return list.Select(t => new [] { t });

    return CalculatePermutations(list, length - 1)
        .SelectMany(x => list.Where(element => !x.Contains(element)),
                    (t1, t2) => t1.Concat(new [] { t2 }).ToArray());
}

I create all permutations up to a given length, which is useful if we want to solve the problem with more or fewer than 4 digits later on. Note that this function is used to create permutations of unique indices, not of the digits themselves. This has two advantages: the permutations can be computed once, outside the loop over all numbers, and we don't have to deal with duplicate permutations when a number contains repeated digits.
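
As a quick usage sketch (assuming the function above and System.Linq): the index permutations are generated once up front and reused for every number.

// Generate all orderings of the digit positions 0..3 once.
var indexPermutations = CalculatePermutations(Enumerable.Range(0, 4), 4).ToList();
// indexPermutations.Count is 24, i.e. 4!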

Part two: all possible operator combinations

Similar to the digits, we have to fill the N-1 slots between the digits with any of the four operators. In this case, however, we need permutations with repetition, since the multiply operator, for example, could be used multiple times.

We can realize this with the same function as before, with one small change: we no longer filter out elements that are already in use:

private static IEnumerable<T[]> CalculatePermutationsWithRepetition<T>(IEnumerable<T> list, int length)
{
    if (length == 1)
        return list.Select(t => new [] { t });

    return CalculatePermutationsWithRepetition(list, length - 1)
        .SelectMany(x => list,
                    (t1, t2) => t1.Concat(new [] { t2 }).ToArray());
}

Stringing it all together

Now that we have all possible digit and operator permutations, we can start combining them. Each expression is evaluated from left to right, simply accumulating the result along the way. Note that all values have to be floating point, since the division operator can produce fractions.
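
A minimal sketch of such a left-to-right evaluator (not the original code; it assumes one digit permutation and one operator permutation as produced by the functions above):

private static double EvaluateLeftToRight(int[] digits, char[] operators)
{
    double result = digits[0];
    for (int i = 0; i < operators.Length; i++)
    {
        double next = digits[i + 1];
        switch (operators[i])
        {
            case '+': result += next; break;
            case '-': result -= next; break;
            case '*': result *= next; break;
            case '/': result /= next; break; // dividing by zero yields Infinity/NaN, never exactly zero
        }
    }
    return result;
}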

But what about the parentheses? Since we have all possible permutations of the digits and of the operators, we generate all possible orders. However, this does not cover all possible parenthesizations; it's easy to see that the expression:

1 / (2 + 3 + 4) 

is never included in the possible orders, since we only evaluate the expression from left to right. I decided to leave out these alternative parenthesizations, as they would introduce yet another set of combinations and thus increase the already factorial runtime even more.
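
Putting the pieces together, the overall check for a single number could look roughly like this (a sketch under the same assumptions; EvaluateLeftToRight is the hypothetical evaluator from the previous sketch, and the small tolerance guards against floating-point noise):

private static bool HasSolution(int[] digits)
{
    var operators = new[] { '+', '-', '*', '/' };

    foreach (var order in CalculatePermutations(Enumerable.Range(0, digits.Length), digits.Length))
    {
        // Reorder the actual digits according to the index permutation.
        var orderedDigits = order.Select(i => digits[i]).ToArray();
        foreach (var ops in CalculatePermutationsWithRepetition(operators, digits.Length - 1))
        {
            if (Math.Abs(EvaluateLeftToRight(orderedDigits, ops)) < 1e-9)
                return true;
        }
    }
    return false;
}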

The findings

I ran the simulation for multiple solution spaces, from 1-digit numbers up to 7-digit numbers. For each of these ranges I calculated how many numbers had a solution and converted those counts to percentages, as seen in the graph below.


The one-digit range is easy to explain: only the single zero has a solution, which equals 10 percent. The two-digit realm doubles this: per each ten numbers there is one containing a zero (10, 20, etc.) and one with two identical digits (11, 22, etc.), adding up to 20 percent. I expected the three-digit numbers to be very hard to solve, as it's quite easy to come up with numbers that can't be solved. I never thought it would practically be little more than a coin flip whether or not a random number can be solved!

The four-digit range finally answers my burning question: 97.3 percent of all numbers in this range have a solution! That explains why I never found a number that could not be solved. Roughly one in every 40 four-digit numbers you come up with should be impossible to solve, yet I have not encountered one thus far.

As I initially expected, the more digits you have to work with, the more likely it is that a solution exists. The puzzle probably does not get easier to solve by hand, but the solution is out there!

While the graph above has a neat explanation for each range, the numbers are slightly off, and here's why: in the range 10-100 we consider 90 numbers, some of which are permutations of each other and some of which are not. For example, 35 and its permutation 53 both appear, but this is not the case for 60, since "06" was already counted as a single-digit number. The graph below shows the percentages if we take this minor difference into account:


Digging deeper

Of course I didn't stop here. While implementing this solution I started out with integer-only arithmetic for testing, which meant I could not use division. I wondered whether there were any numbers that only have a solution involving one or more divisions.

This resulted in arguably the most interesting result of the whole experiment. The exact count of 4-digit numbers that only have a solution using one or more divisions: 24. As I soon realized, this is exactly the number of permutations of a 4-digit number (4! = 24)!

That means there is exactly one set of digits (appearing in 24 different orders), and that number is 3468; for example, 6 / 8 * 4 - 3 = 0. I took a moment to appreciate the beauty here and just calculated numerous ways of solving it.

So what about the 2.4% of numbers that have no solution? Strikingly similar to the above, 2.4% of 10000 equals 240 numbers. Once more these turn out to be the permutations of 10 unique digit sets. These are the 10 numbers:

1468
1479
1579
3678
3789
4568
4678
4689
5679
5789

Unfortunately I haven't found the black magic that ties all these numbers together yet, but maybe you can! Be sure to let me know. Below are two pieces of data I gathered for each of these numbers: their sum, and the permutation of digits and operators that comes closest to zero.



The final thing I calculated is what would change if the target were a number other than zero:
The result looks like something close to a skewed normal distribution. Interesting targets were -24, for which just over 50% of all numbers have a solution, and 72, with just under 50%.

Conclusion

Thanks for taking the time to read about this over-analyzed little number game. I hope you still learned a thing or two from this futile effort to promote my nonsense arithmetic. If you enjoy me getting enthusiastic about absolutely nothing, feel free to bookmark this page.

Wednesday, August 21, 2019

Ryne

I just released a tech demo of Ryne, my game-engine-to-be. You can find more information and media on the Ryne Engine website.

Ryne is an experimental voxel pathtracer. You can try the tech demo yourself to see if it runs on your PC.

In the future I will release the C# scripting side as open source, which will allow you to change most of the engine features yourself. You can see an overview of the functionality currently included in that project here.



Why “release” it now?

I don't want to create endless videos about a new technology that you just have to believe is real. Instead, I'd like to include as many of you as possible while Ryne is in development. I know Ryne still has flaws and is nowhere near as usable as I want it to be, but I'm convinced that putting out tech demos like these will get the point across that we will get there eventually. This will help me focus on the most important issues, and make everyone aware that this is no longer a dream: we can actually render something right now.

I'm trying to keep a stream of releases going out whenever I feel a new set of features or bug fixes is good to go.

If you'd like to discuss or have more questions, feel free to join the Discord server.

Monday, April 23, 2018

Cryptocurrencies in (p)review

And now for something completely different. Recently a new sector of technology caught my attention: cryptocurrencies. Most of you reading this will know about this technology from news headlines about the ridiculous price growth of Bitcoin. In this article I will try to create an overview of the whole crypto space.

All over the internet you can find people either praising Bitcoin or completely dismissing its existence. Even in the investing world there are "Bitcoin maximalists": people who only invest in Bitcoin and ignore all other cryptocurrencies. My view on Bitcoin is fairly neutral; there are both advantages and disadvantages to the almost 10-year-old cryptocurrency. The largest upside is the fact that everyone knows Bitcoin exists, a miracle on its own. The downside is the (already) dated tech: Bitcoin has run into major problems with transaction speed and transaction costs. A solution, the Lightning Network, is already being rolled out, but it may be too little, too late.

Enough about Bitcoin. There is one other topic I have to address before we can have a good discussion about the technology rather than its price: investing. Just like the opinions on Bitcoin, investing in cryptocurrencies is a hot topic, and will probably spike in the news again if prices rally. The question is always "should I invest in cryptocurrencies?". The only answer to this question has to be: decide for yourself. Before you race to Google, be aware that most, if not all, sources of information carry a certain bias. You could for example visit the cryptocurrency subreddit and find a lot of people who are positive on certain currencies. On the other hand, I found this article from MrMoneyMustache, despite its negative outlook, pretty informative. Either way, this post is not about investing, but about the technology behind several cryptocurrencies.

I hope to have retained enough readers after that last paragraph, so below is a short statement on why cryptocurrencies are still in their infancy:


The reason I became so interested in cryptocurrencies has nothing to do with the price of any particular one. I got interested because almost all of the projects are completely open source! This openness is an interesting shift compared to today's closed-source companies. There is even a page that monitors activity on all the crypto GitHub repositories.

Other than the code being free for everyone to view, I was astounded by the creativity of some projects. Some of the interesting projects I encountered are listed below (hint: this is where you can find my bias).

IOTA: an Internet of Things solution for the crypto world, using what they call the "Tangle", a technology placed in the category of "Blockchain 3.0". The IOTA team tries to solve the original peer-to-peer transaction problem with a revolutionary but simple method: as a user of the network, you have to verify two other transactions whenever you submit a new one. Their paper goes into great depth on safety and even discusses resistance to quantum computing. Interestingly enough, I stumbled upon a good friend of mine in both IOTA and NANO: the DAG (Directed Acyclic Graph), which I happen to use for compressing Sparse Voxel Octrees (see other blog posts ;) ).

Storage solutions on the blockchain are a very interesting idea. You can upload files completely anonymously, and nobody in the world is able to take down your file (unless they happen to have the power to cut off all existing nodes). Notable mentions are Oyster Pearl and SIA, decentralized storage solutions that are currently being launched.

Battling the recent Net Neutrality changes in the US are projects like Substratum and the recently announced Oyster Shell. These projects aim to create a decentralized web, where you access the internet by connecting through other users instead of central authorities like internet providers. Unfortunately this is only a software solution, as the hardware between users can still be controlled by centralized authorities.

Blockchain application platforms like Ethereum and NEO allow other cryptocurrencies and dapps (decentralized applications) to be hosted on their network. Ethereum, being the second-oldest crypto, currently hosts 1385(!) projects. Cryptolights is an interesting visualization comparing transaction speeds of mainstream cryptocurrencies. Even though it is faster than Bitcoin, Ethereum also hit its limits when a famous app called CryptoKitties was released.

Privacy coins like Monero and Enigma mask public transactions by scrambling groups of transactions together. Enigma even allows private data inside "secret contracts", which can perform the necessary computation without ever seeing the complete dataset.

There are too many awesome projects to list here; I left out several other interesting ones to keep this article reasonably short. You can always check CoinMarketCap yourself and find more projects.

So what now?
We have seen the power of blockchain and the positive sides of the crypto world, so it's time for a reality check and a look at the "dark side of crypto". The most important problem that urgently needs solving is the power consumption. This page shows Bitcoin's power consumption:


The power consumption at the moment of writing is comparable to that of all of Switzerland. A single transaction consumes enough energy to power about 32 US households for a day. On the other hand, this article compares the power consumption of the current banks and VISA with that of Bitcoin. It gives a rough estimate of the power consumed by the entire banking system worldwide, which is of course more than Bitcoin, but banks won't be completely replaced by Bitcoin: there will still be buildings and computers that need to be powered, even if the work done on them changes.

The problem mainly exists because of the financial incentive generated by what the crypto world calls mining (also known as Proof of Work). Users offer their computational power to solve a cryptographic puzzle for the next block of transactions (much to the outrage of gamers, for whom graphics cards became an expensive purchase). If only there were another readily available solution for Byzantine fault tolerance... Luckily some projects have already adopted PoS (Proof of Stake), where users stake their coins to become a decision node on the network. Examples like Ethereum's Sharding and NEO's dBFT give me hope for a more eco-friendly solution.

A second problem appears when looking at the current state of the crypto world. There are currently around 1600 currencies in existence, most of which will fail completely or are outright scams. This seems comparable to the dot-com bubble, where many internet startups went bankrupt when the market collapsed.

Finally, I often see the statement "crypto makes money laundering way too easy". Guess when a similar statement appeared before: when everyone was speculating about the invention called "the internet". I can't say it isn't a problem, but removing all privacy from the crypto space isn't the solution either.

But you promised a discussion
Now that I have overloaded you with information, let's dive even deeper. Why does this technology exist? Why did Satoshi Nakamoto (creator of the blockchain and Bitcoin) publish the paper anonymously and then disappear? There are a lot of unanswered questions in this space, and I can't wait to see the documentary about the crypto space a decade from now.

Would you say that this technology was predetermined? This article gives a good reason why such an evolution was a logical step (using technology to lower economic cost). After all, the technology was already predicted in 1999.

Put yourself in Satoshi Nakamoto's place for a moment. Would you go public? I think he has enough reason to stay anonymous; going public might make him responsible for undermining governments or banking systems. On the upside, you would have a lot of Bitcoin...

How about measuring what such a network is actually worth? This article builds on a previous article about the Network Value to Transactions (NVT) ratio. The NVT ratio is used as a "crypto PE ratio" in order to compare cryptocurrencies in a similar fashion to stocks. The NVT ratio is based on Metcalfe's law, a measure of a network's value in terms of its users. While it is reasonably accurate, it would be interesting to see a ratio that also includes transactions that happen off-chain, for example exchange transactions.
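
For reference, the ratio described in the linked articles boils down to the following (a sketch; the exact smoothing applied to the transaction volume differs per article):

\[
\text{NVT} = \frac{\text{network value (market capitalization)}}{\text{daily on-chain transaction volume}}
\]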



Whether or not any of the currently existing currencies survive, I do think the technology will trigger a global shift in how we deal with data. Blockchain technology (and all of its successors) has excellent use cases for dealing with personal data. We could use the technology to make data immutable (meaning no tampering with information is possible) and more private. Maybe the dystopian fiction of having a small chip underneath your fingernail containing your private key will even become reality...

Finally, my hope for the future is that we become less dependent on central organizations managing all the world's currency and shift towards a decentralized solution, while we keep dreaming about a distributed network for efficiency. With this overview article I hope to have sparked your interest in this network of "trust based on distrust".

Did I miss something important, or do you want to discuss? Let me know in the comments.

Sources:
https://en.wikipedia.org/wiki/Lightning_Network
https://trends.google.com/trends/explore?q=bitcoin
https://blockchain.info/nl/charts/avg-confirmation-time
https://bitinfocharts.com/comparison/bitcoin-transactionfees.html
https://www.reddit.com/r/CryptoCurrency/
https://www.mrmoneymustache.com/2018/01/02/why-bitcoin-is-stupid/
https://coincheckup.com/analysis/github
https://www.iota.org/
https://nano.org/en
https://oysterprotocol.com/
https://sia.tech/
https://substratum.net/
https://www.ethereum.org/
https://neo.org/
https://getmonero.org/
https://enigma.co/
https://www.coindesk.com/cat-fight-ethereum-users-clash-cryptokitties-congestion/
https://coinmarketcap.com/
https://hackernoon.com/the-bitcoin-vs-visa-electricity-consumption-fallacy-8cf194987a50
https://en.wikipedia.org/wiki/Proof-of-work_system
https://en.wikipedia.org/wiki/Byzantine_fault_tolerance
https://en.wikipedia.org/wiki/Proof-of-stake
https://github.com/ethereum/wiki/wiki/Sharding-FAQ
https://steemit.com/neo/@basiccrypto/neo-s-consensus-protocol-how-delegated-byzantine-fault-tolerance-works
https://steemit.com/blockchain/@skane/a-vision-of-a-hivemind
https://medium.com/cryptolab/https-medium-com-kalichkin-rethinking-nvt-ratio-2cf810df0ab0
https://en.wikipedia.org/wiki/Metcalfe%27s_law
https://medium.com/@bbc4468/centralized-vs-decentralized-vs-distributed-41d92d463868

Thursday, July 6, 2017

Automated marshalling of managed- to unmanaged structures

An art I wouldn't wish upon even my greatest enemies to have to figure out.

I recently started a new project in a language very familiar to me: C#, a managed language. This means that all memory management is done for you, which is both a blessing and a curse. For this project, I happened to stumble over the "curse" part of automatic memory management.

The problem I encountered is as follows: when two structures are identical in terms of their members, their respective sizes can (and often will) differ between managed and unmanaged languages. For example, I have the following structure in C# and C++ respectively:

// C#
[StructLayout(LayoutKind.Sequential)]
struct DebugComponent
{
    public float4 Float4;
    public float Float;
}

// C++
struct CPP_DebugComponent
{
    float4 Float4;
    float Float;
};

In C#, the size of the structure can be found using Marshal.SizeOf() (or sizeof() in unsafe code), which reports that the structure is 20 bytes, which is correct. Note that I already set the StructLayout to Sequential, as this creates a layout similar to unmanaged code.
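
As a quick check (a sketch, assuming the DebugComponent struct above and a using for System.Runtime.InteropServices):

// Reports 20 for the layout above: 16 bytes for the float4 plus 4 for the float.
int managedSize = Marshal.SizeOf(typeof(DebugComponent));
Console.WriteLine(managedSize);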

The size of the same structure in C++, using sizeof(), is reported as 32 bytes. This is also correct, because the float4 type here is aligned to 16 bytes, meaning the structure receives another 12 bytes of padding at the end so that its total size is a multiple of 16 bytes.

Unfortunately, when using this structure in a tool such as ManagedCuda, the CUDA kernel uses the C++ version of the struct, while the C# code calling the kernel uses the managed version. This creates a mismatch in memory layout, resulting in very weird artifacts after running the kernel, or even crashes because you end up writing to unallocated memory.

The "simple" solution I found is to manually expand the C# structure by using the StructLayout.Size attribute to extend the structure to 32 bytes instead of 20. After asking my question on StackOverflow, I didn't solve the problem to create these structures automatically without counting the sizes of every individual type in the structure itself.

So I had to switch up my solution a little bit. I created a project containing the raw C# structures I want to use, along with all their functionality such as loading and serialization. I then created another C# project for the automated code generation. Using this project we can load the Structures project as a dll, derive all the structures and the types they contain, and generate the GPU versions via text templates. So far that gives two projects:
  • Structures: project that contains raw structures that will be used on the GPU
  • Tools: my general purpose project that will generate GPU versions of the structs defined in Structures.
To make this conversion as safe as possible, I don't want to manually check whether the GPU version has the same alignment and size every time I create a new structure. So I created two more projects:
  • AlignedStructsWrapper: A C++/CLI project that combines managed and unmanaged code
  • Tests: a unit test project
Using the C++/CLI project, we can load both versions of the structure: the managed C# version and the unmanaged C++ version. We can now measure the difference in their sizes:

public ref struct WrapperGpuDebugComponent
{
public:
    int SizeDiff()
    {
        int managedSize = sizeof(Tools::Content::Generated::DebugComponent);
        int nativeSize = sizeof(CUDA::CPP_DebugComponent);
        return managedSize - nativeSize;
    }
};

In the unit test project we reference the C++/CLI assembly, and we can create a simple unit test that calls the SizeDiff function and checks that the difference is indeed 0:

[TestMethod]
public void CheckStructureSizes()
{
    var debugComponent = new WrapperGpuDebugComponent();
    // Assert.AreEqual takes the expected value first, then the actual value.
    Assert.AreEqual(0, debugComponent.SizeDiff());
}

Of course I also generate the C++/CLI structures and the unit test functions automatically for every structure, so I only have to recompile the projects to have everything tested.

Sunday, May 14, 2017

CUDA in Visual Studio 2017

Edit: the CUDA 9.0 RC has been released. This version includes full Visual Studio 2017 support.

Note: this article only shows how to compile Visual Studio 2015 CUDA projects in Visual Studio 2017. For actual VS2017 support we will have to wait for a new CUDA release.

I previously wrote a small article on CUDA support for VS2015, to enable CUDA compilation of older projects. Following the same principle, we can 'hack' CUDA compilation support into VS2017.

What you need
  • A CUDA installation with Visual Studio integration. I used CUDA 8.0 with the VS2015 integration.
  • VS2017 (any edition)
Copying the required files
  • To allow CUDA compilation we have to copy a few files. Find the CUDA 8.0 settings files in the VS2015 BuildCustomizations directory:
C:\Program Files (x86)\MSBuild\Microsoft.Cpp\v4.0\V140\BuildCustomizations
Note: If you use a different VS version, you have to change the 'V140' accordingly (V120 for VS2013 for example).
  • Copy the following files: CUDA 8.0.props, CUDA 8.0.targets, CUDA 8.0.xml, and Nvda.Build.CudaTasks.v8.0.dll
  • Find the VS2017 buildcustomizations directory:
C:\Program Files (x86)\Microsoft Visual Studio\2017\Community\Common7\IDE\VC\VCTargets\BuildCustomizations
Note: I used the VS2017 Community edition. If you have another edition, change 'Community' in the path accordingly.

  • Paste the CUDA files here.

That's it
You can now load and compile your VS2015 CUDA projects in VS2017. When you first open your project in VS2017, make sure not to upgrade the project to VS2017, otherwise this won't work.

Friday, April 28, 2017

IniGenerator

I wrote a simple C# code interface for ini files. There are already great NuGet packages for parsing and writing ini files, but not many that automatically generate a code layer. For this project I based my solution on the ini parser to do the file IO.

My package only provides a small overlay that creates a C# class handling all the file IO behind the scenes. Using text templates, we both define the ini file and generate a C# class. Here is an example template I used to create my configuration:

<#@ include file="$(ProjectDir)IniTemplate.tt" #>
<#
    // All properties in the ini file
    // Name, default value and category
    CreateProperty("Width", 1280, "Video");
    CreateProperty("Height", 720, "Video");
    CreateProperty("Fullscreen", false, "Video");

    // Generate the code layer
    GenerateIniClass();
#>

This creates a C# class with the same name as your text template. The ini file will either be created the first time you use this class, or the old values will be read from the existing file.

Usually there is no backwards compatibility with older versions of the ini file: if you add new properties, all values in the ini file are reset to their defaults. I avoid this scenario by using the ini parser's beautiful functionality for merging two ini files, so I can simply add the new properties to the old ini file without changing the existing values.

Finally, an example of the above template used in code:

// Use the namespace where you placed the template
using IniGenerator.Content.Generated;

// Name of ini file
var config = new Config();
// Can be used directly in code without parsing
var size = new Size(config.Width, config.Height);
var fullscreen = config.Fullscreen;

You can view the source code on GitHub, or download the package from NuGet.

If you have any feedback, leave a comment or post an issue on the GitHub project.

Saturday, April 22, 2017

Master Thesis

Update: You can download the full thesis here.

Level-of-Detail Independent Voxel-Based Surface Approximations was the subject of my master thesis. I wrote a short write-up that explains the basics of my thesis on this page.


This image shows the final result of my thesis work. The models above are voxel models with 4096 (2^12) voxels along every axis. If they were all filled, I would have to store 4096^3 = 68,719,476,736 voxels in total. There has been a lot of research into compressing the huge amount of data this requires; I mention some examples on the thesis page.

Using a Sparse Voxel Octree (SVO) storing scalar field values, the six models above can be stored in 12 GB of memory in total. Using my multiresolution method, visually comparable models can be stored in only 2 GB in total.

Here is a small video showing the current state of the voxel path tracer: