Idea : Self-Organizing-Maps

Stage 0 : Color

We may take a live video feed, or several live video feeds, and chop them up into pieces which will be fed into a two dimensional self organizing map for display. Possible pieces include

individual pixel values : the map will appear as a projection of 3 dimensional color space into two dimensions, with each color represented in proportion to its prevalence in the environment.

small neighborhoods : will appear similar to the map generated by individual pixel values, but may also capture variation in the texture of the environment. This will be more computationally expensive than using simple pixel values.

color spectrum profiles in time : We can extract a color spectrum profile, which will be a histogram of the colors present in the environment. This sample will change in time. Individual elements of the resulting map will resemble the complete maps from individual pixel values.
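The first option above, feeding individual pixel values into a 2D map, can be sketched with a minimal SOM training loop. This is a generic illustration, not a finished exhibit: the grid size, decay schedules, and the toy "video frame" of palette-drawn pixels are all placeholder choices.

```python
import numpy as np

def train_color_som(pixels, grid=(16, 16), steps=2000, seed=0):
    """Train a 2D self-organizing map on RGB pixel values.

    pixels: array of shape (n, 3) with values in [0, 1].
    Returns the map weights, shape (grid[0], grid[1], 3).
    """
    rng = np.random.default_rng(seed)
    h, w = grid
    weights = rng.random((h, w, 3))            # random initial colors
    ys, xs = np.mgrid[0:h, 0:w]                # lattice coordinates of each node
    sigma0, lr0 = max(h, w) / 2.0, 0.5

    for t in range(steps):
        x = pixels[rng.integers(len(pixels))]  # draw a random input pixel
        # best-matching unit: the node whose weight is closest to the input
        d = np.linalg.norm(weights - x, axis=2)
        by, bx = np.unravel_index(np.argmin(d), d.shape)
        # exponentially decaying neighborhood radius and learning rate
        frac = t / steps
        sigma = sigma0 * (0.05 / sigma0) ** frac
        lr = lr0 * (0.01 / lr0) ** frac
        # Gaussian neighborhood pulls nodes near the winner toward the input
        g = np.exp(-((ys - by) ** 2 + (xs - bx) ** 2) / (2 * sigma ** 2))
        weights += lr * g[:, :, None] * (x - weights)
    return weights

# toy stand-in for a video frame: pixels drawn from three dominant colors
rng = np.random.default_rng(1)
palette = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1]], dtype=float)
pixels = palette[rng.integers(3, size=5000)] + 0.05 * rng.standard_normal((5000, 3))
som = train_color_som(pixels, grid=(8, 8), steps=1000)
```

Each common color ends up claiming a region of the map roughly in proportion to its prevalence in the input, which is the projection-of-color-space behavior described above.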

Stage 1 : Sound

Self organizing maps can take a set of input patterns and arrange them in an N dimensional space based on similarity structure.

This can be applied to music.

We can use either the frequency spectrum or a sample over some longer duration as input. Either way, we chop a sound sample up into a set of inputs, which will be fed into an N dimensional self organizing map.
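The chopping step for the frequency-spectrum option might look like the sketch below: overlapping windowed frames, each reduced to its magnitude spectrum, one vector per SOM input. The frame length, hop size, and test tone are arbitrary illustration values.

```python
import numpy as np

def spectrum_frames(signal, frame_len=1024, hop=512):
    """Chop a 1D audio signal into overlapping frames and return the
    magnitude spectrum of each frame -- one SOM input vector per frame."""
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        window = signal[start:start + frame_len] * np.hanning(frame_len)
        frames.append(np.abs(np.fft.rfft(window)))
    return np.array(frames)   # shape (n_frames, frame_len // 2 + 1)

# a toy signal whose pitch jumps an octave halfway through
sr = 8000
t = np.arange(sr) / sr
signal = np.concatenate([np.sin(2 * np.pi * 440 * t),
                         np.sin(2 * np.pi * 880 * t)])
X = spectrum_frames(signal)   # each row is one input pattern for the SOM
```

Frames with similar spectra will land near each other on the trained map, so sustained notes and timbres cluster into neighborhoods.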

If user input is desired, we will make the map two dimensional. The user controls which part of the sound space is played back by selecting a portion of the 2D map. We may use cubic or linear interpolation to generate smooth variation in sounds.
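The linear-interpolation option could work as below: bilinearly blend the four node vectors around the user's fractional position on the map, so playback varies smoothly as the selection moves. `sample_map` is a hypothetical helper name, and the toy 4x4 map is for illustration only.

```python
import numpy as np

def sample_map(weights, u, v):
    """Bilinearly interpolate SOM node vectors at fractional grid
    position (u, v), for smooth variation as the user drags a
    selection across the 2D map."""
    h, w = weights.shape[:2]
    u = float(np.clip(u, 0, h - 1))
    v = float(np.clip(v, 0, w - 1))
    y0, x0 = int(u), int(v)
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    fy, fx = u - y0, v - x0
    top = (1 - fx) * weights[y0, x0] + fx * weights[y0, x1]
    bot = (1 - fx) * weights[y1, x0] + fx * weights[y1, x1]
    return (1 - fy) * top + fy * bot

grid = np.arange(16.0).reshape(4, 4, 1)   # toy map: each node holds its index
mid = sample_map(grid, 1.5, 1.5)          # equidistant from four nodes
```

At (1.5, 1.5) this returns the average of nodes 5, 6, 9, and 10, i.e. 7.5; at integer coordinates it returns the node vector exactly.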

If no user input is desired, we may random walk around an N dimensional map.
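A random walk over the map lattice is simple to sketch: at each step, move one node along a randomly chosen axis, clipped at the edges. The 10x10x10 lattice here is a placeholder; the same function works for any N.

```python
import numpy as np

def random_walk(shape, steps, seed=0):
    """Random walk on an N-dimensional SOM lattice: each step moves one
    node along a randomly chosen axis, clipped at the boundaries."""
    rng = np.random.default_rng(seed)
    pos = np.array([s // 2 for s in shape])   # start at the lattice center
    path = [tuple(pos)]
    for _ in range(steps):
        axis = rng.integers(len(shape))
        pos[axis] = np.clip(pos[axis] + rng.choice([-1, 1]), 0, shape[axis] - 1)
        path.append(tuple(pos))
    return path

path = random_walk((10, 10, 10), 50)   # visit 51 nodes of a 3D map
```

Playing back the node vector at each visited position gives a slowly wandering tour of the sound space, since neighboring nodes hold similar sounds.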

If we wish to make the exhibit change over time, we could make the map 3 or 4 dimensional, and have a randomly drifting 2D plane of user interaction that intersects the higher dimensional space.
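One simple realization of the drifting-plane idea, assuming the plane stays axis-aligned: let the depth coordinate of a 3D map drift as a random walk, and interpolate between the two nearest slices. A tilted or rotating plane would need more general interpolation; `drifting_slices` and its drift rate are hypothetical.

```python
import numpy as np

def drifting_slices(weights, steps, drift=0.3, seed=0):
    """Yield 2D slices of a 3D SOM as the user-interaction plane drifts
    randomly along the third lattice axis (axis-aligned for simplicity)."""
    rng = np.random.default_rng(seed)
    depth = weights.shape[2]
    z = depth / 2.0                            # start in the middle
    for _ in range(steps):
        z = float(np.clip(z + drift * rng.standard_normal(), 0, depth - 1))
        lo, hi = int(z), min(int(z) + 1, depth - 1)
        f = z - lo
        # linearly blend the two lattice slices bracketing the plane
        yield (1 - f) * weights[:, :, lo] + f * weights[:, :, hi]

cube = np.random.default_rng(2).random((8, 8, 8, 3))  # toy 3D map of colors
slices = list(drifting_slices(cube, 20))
```

Each yielded slice is what the 2D interface would present at that moment, so the exhibit gradually moves through regions of the higher-dimensional map over time.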

Stage 2 : Correlate

In the case where we use color spectrum samples over time, we may correlate visual maps with auditory maps. This will require ongoing input and interaction with the viewers. It may be possible for the network to play back sounds it has heard when it encounters visual input previously correlated with those sounds, or vice versa. I'm not entirely sure how this phase would be achieved computationally.

I don't have the time or resources to construct this at the moment. Some day.


Self-Organizing Maps

I created these visualizations of the output from a neural network algorithm for embedding a complex perceptual space in two dimensions. It would be very interesting to apply this algorithm to other visual data sets. If applied to auditory patches it could also be a useful generator of electronic music.


edit :

I had a mildly disheartening realization earlier today: this network was only able to create a map with smooth variation between written digits because the training set included ambiguous or intermediate examples. At the moment I can't think of a way to get a smoothly varying map if, say, all we have is ten perfect digits. There ought to be a way, though.