I'm having trouble understanding why the handling of dates, labels, and breaks does not work as I expected in R when making a histogram with ggplot2.

I'm looking for:

- A histogram of the frequency of my dates.
- Tick marks centered under the matching bars.
- Appropriate limits, with minimal empty space between the edge of the grid and the outermost bars.

I've uploaded my data to pastebin to make this reproducible. I've created several columns, as I wasn't sure of the best way to do this:

    > head(dates)

    library(ggplot2)
    library(scales)
    dates$converted <- as.Date(dates$Date, format="%Y-%m-%d")
    ggplot(dates, aes(x=converted)) + geom_histogram()

I wanted %Y-%b formatting, though, so I hunted around and tried the following, based on this SO:

    ggplot(dates, aes(x=converted)) + geom_histogram()

    stat_bin: binwidth defaulted to range/30.

- The frequency distribution has changed shape (binwidth issue?).
- Tick marks don't appear centered under bars.

I worked through the example in the ggplot2 documentation at the scale_x_date section, and geom_line() appears to break, label, and center ticks correctly when I use it with the same x-axis data. I don't understand why the histogram is different.

Updates based on answers from edgester and gauden

I initially thought gauden's answer helped me solve my problem, but am now puzzled after looking more closely. Note the differences between the two answers' resulting graphs after the code.

Based on edgester's answer below, I was able to do the following:

    freqs <- aggregate(dates$Date, by=list(dates$Date), FUN=length)
    freqs$names <- as.Date(freqs$Group.1, format="%Y-%m-%d")
    ggplot(freqs, aes(x=names, y=x)) + geom_bar(stat="identity") +
        scale_x_date(breaks="1 month", labels=date_format("%Y-%b")) +
        ylab("Frequency") + xlab("Year and Month") +
        theme_bw() + opts( = theme_text(angle=90))

Here is my attempt based on gauden's answer:

    dates$Date <- as.Date(dates$Date)
    ggplot(dates, aes(x=Date)) + geom_histogram(binwidth=30, colour="white") +

Note the following:

- There are gaps in gauden's plot for 2009-Dec and 2010-Mar, yet table(dates$Date) reveals 19 and 26 instances, respectively, for those months in the data.
- edgester's plot starts at 2008-Apr and ends at 2012-May. This is correct based on the minimum and maximum dates in the data. For some reason gauden's plot starts in 2008-Mar and still somehow manages to end at 2012-May.

After counting bins and reading along the month labels, for the life of me I can't figure out which plot has an extra bin or is missing a bin of the histogram!

Any thoughts on the differences here? edgester's method of creating a separate count

As an aside, here are other locations that have information about dates and ggplot2, for passers-by looking for help:

- Started here at learnr.wordpress, a popular R blog. It stated that I needed to get my data into POSIXct format, which I now think is false and wasted my time.
- Another learnr post recreates a time series in ggplot2, but wasn't really applicable to my situation.
- r-bloggers has a post on this, but it appears outdated. The simple format= option did not work for me.
- This SO question is playing with breaks and labels. I tried treating my Date vector as continuous and don't think it worked so well. It looked like it was overlaying the same label text over and over, so the letters looked kind of odd. The distribution is sort of correct but there are odd breaks. My attempt based on the accepted answer was like so (result here).

I update the example to demonstrate aligning the labels and setting limits on the plot. I also demonstrate that as.Date does indeed work when used consistently (actually it is probably a better fit for your data than my earlier example).
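One possible explanation for gaps in a histogram drawn with binwidth=30 is that calendar months are not 30 days long, so fixed 30-day bins drift against the first of each month. The sketch below uses base R only and makes two assumptions that are not taken from the post: the data are replaced by a hypothetical vector with one date on the first of every month from 2008-Apr to 2012-May, and the bins are anchored at day 0 of R's Date scale (1970-01-01). ggplot2 may place its bins differently depending on version, so treat this as an illustration of the drift, not a reproduction of the plot.

```r
# Hypothetical stand-in for the real data: one date on the first of every
# month from 2008-04-01 to 2012-05-01 (the actual counts are not needed).
month_starts <- seq(as.Date("2008-04-01"), as.Date("2012-05-01"), by = "month")

# Fixed 30-day bins. Anchoring them at 1970-01-01 (day 0 of the Date scale)
# is an assumption made for illustration only.
binwidth <- 30
bin_id   <- as.numeric(month_starts) %/% binwidth
all_bins <- seq(min(bin_id), max(bin_id))

# Number of month starts caught by each 30-day bin, including empty bins.
per_bin <- as.vector(table(factor(bin_id, levels = all_bins)))

# Calendar-month bins: group by a year-month label instead.
per_month <- table(format(month_starts, "%Y-%m"))

# First day of every 30-day bin that caught no month start at all.
empty_bin_starts <- as.Date(all_bins[per_bin == 0] * binwidth,
                            origin = "1970-01-01")

print(table(per_bin))        # how many bins hold 0, 1, or 2 month starts
print(empty_bin_starts)      # where the empty 30-day bins begin
print(all(per_month == 1))   # calendar-month grouping has no gaps
```

Under these assumptions some 30-day bins catch no month start while a neighbouring bin catches two, whereas grouping by calendar month gives exactly one group per month. That is consistent with counting per date first (as in the aggregate() approach) producing a gap-free bar chart.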