Data visualization is a powerful tool for communicating results and has recently received more and more attention due to the hype around data science. Integrating a meaningful graph into a paper or your thesis can improve readability and understandability more than any formulas or extended textual descriptions can. There exists a variety of different approaches for visualising data. Recently a lot of new JavaScript-based frameworks, which can be used in Web applications and apps, have gained quite some momentum. A more classical workhorse for data science is the R project and its plotting engine ggplot2. The reason why I decided to stick with R is its popularity and flexibility, which is still impressive. Also, with RStudio there exists a convenient IDE which provides useful features for data scientists. In this blog post, I demonstrate how to plot time series data and use colours to highlight a specific aspect of the data. Like almost all techniques, R and ggplot2 require practice and training, which I realised again today when I spent quite a bit of time struggling to get a simple plot right.

Currently I am evaluating two systems I developed, and I needed to visualize their storage and execution time demands in comparison. My goal was to create one plot for each non-functional property, the execution time and the storage demand, with each plot depicting the performance of both systems. Each system runs a set of operations; think of create, read, update and delete operations (CRUD). To visualize which of these operations has the largest effect on the system, I needed to colourise each operation within one graph. The trickier part was to provide a defined set of colours for each graph, which can be mapped to each instance of the variable. Things which have the same meaning in both graphs should be visualized in the same way, which requires a little hack.

Install the following packages via apt

    sudo apt-get install r-base r-recommended r-cran-ggplot2

and RStudio by downloading the deb file from the project homepage.

Evaluation Data

As an example, we plan to evaluate the storage demand of two different systems and compare the results.

    # Set seed to get the same random numbers for this example
    set.seed(42)  # assumed seed value
    N = 100       # assumed sample size
    # Generate a random, increasing sequence of integers that we assume is the storage demand in some unit
    storage1 = sort(sample(1:100000, size = N, replace = TRUE), decreasing = FALSE)
    storage2 = sort(sample(1:100000, size = N, replace = TRUE), decreasing = FALSE)
    # Define the operations available and draw a random sample
    operationTypes = c('CREATE', 'READ', 'UPDATE', 'DELETE')
    operations = sample(operationTypes, N, replace = TRUE)
    # Combine everything into one data frame (name and id column assumed)
    measurements = data.frame(id = 1:N, storage1, storage2, operations)

We created a random data set simulating the characteristics of system measurement data. As you can see, we have a list of operations of the four types CREATE, READ, UPDATE and DELETE and a measurement value for the storage demand in both systems.

Plotting two graphs of the columns storage1 and storage2 is straightforward.

    library(ggplot2)
    # The data frame name "measurements" is assumed
    ggplot(measurements) +
      geom_point(aes(x = id, y = storage1, color = "Storage 1")) +
      geom_point(aes(x = id, y = storage2, color = "Storage 2")) +
      scale_color_manual(values = c("Storage 1" = "forestgreen", "Storage 2" = "aquamarine"),
                         name = "Measurements", labels = c("System 1", "System 2"))

Note that the color name "Storage 1", for instance, does not denote a color; it assigns a level to all points of that graph. This level can be thought of as a category, which ensures that all the points which belong to the same category have the same color. The actual color is assigned to this level in the definition of the color scale.

Plotting Levels

A common task is to visualise categories or levels of measurement data. This is the result:

[Figure: resulting plot]

In this example, there are four different levels we could observe: CREATE, READ, UPDATE and DELETE.
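The levels idea, together with the "little hack" of showing the same meaning in the same way in every graph, could be sketched roughly as follows. This is only an illustration under assumptions: the data frame name `measurements`, the seed, the sample size `N`, the colour choices in `operationColours` and the shared scale object `operationScale` are hypothetical and not taken from the original evaluation. The sketch maps the operation type to the colour aesthetic and reuses one named colour vector for both plots, so that CREATE, READ, UPDATE and DELETE get the same colour in each of them.

```r
library(ggplot2)

# Assumed seed and sample size, chosen only for this illustration
set.seed(42)
N <- 100

# Hypothetical data set with the columns described in the post
operationTypes <- c("CREATE", "READ", "UPDATE", "DELETE")
measurements <- data.frame(
  id = 1:N,
  storage1 = sort(sample(1:100000, size = N, replace = TRUE)),
  storage2 = sort(sample(1:100000, size = N, replace = TRUE)),
  operations = factor(sample(operationTypes, N, replace = TRUE),
                      levels = operationTypes)
)

# One named vector ties each level to a fixed colour (colours are arbitrary)
operationColours <- c(
  "CREATE" = "forestgreen",
  "READ"   = "steelblue",
  "UPDATE" = "orange",
  "DELETE" = "firebrick"
)

# Shared colour scale, added to every plot that should use the same mapping
operationScale <- scale_color_manual(values = operationColours,
                                     name = "Operation")

# One plot per system, both coloured by operation type
plotSystem1 <- ggplot(measurements,
                      aes(x = id, y = storage1, color = operations)) +
  geom_point() +
  operationScale +
  labs(title = "System 1", y = "Storage demand")

plotSystem2 <- ggplot(measurements,
                      aes(x = id, y = storage2, color = operations)) +
  geom_point() +
  operationScale +
  labs(title = "System 2", y = "Storage demand")
```

Calling `print(plotSystem1)` or `print(plotSystem2)` (or simply typing the object name in the RStudio console) renders the plot. Because the colours are looked up by level name rather than by position, the mapping should stay stable even if one data set happens to contain only some of the four operations.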