As part of our continuing effort to digitise poetry and to automate new forms of poetry, we released an R package called pattern.nlp, which is available at https://github.com/bnosac/pattern.nlp. It allows R users to do sentiment analysis and Parts of Speech tagging for text written in Dutch, French, English, German, Spanish or Italian. It can also be used for other purposes, such as data preparation as part of a topic modelling flow.
If you are interested in text mining, feel free to register for the text mining courses listed in our last blog post.
If you just want to do sentiment analysis and POS tagging in these European languages, proceed as follows. Sentiment analysis is available for Dutch, French & English.
library(pattern.nlp)
## Sentiment analysis
x <- pattern_sentiment("i really really hate iphones", language = "english")
y <- pattern_sentiment("de wereld is een mooie plaats, nietwaar sherlock", language = "dutch")
z <- pattern_sentiment("j'aime Paris, c'est super", language = "french")
rbind(x, y, z)
polarity  subjectivity  id
   -0.80          0.90  i really really hate iphones
    0.70          1.00  de wereld is een mooie plaats, nietwaar sherlock
    0.65          0.75  j'aime Paris, c'est super
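Once the scores are in a data.frame, plain base R is enough to work with them. The following is a minimal sketch, not part of pattern.nlp: it rebuilds the output shown above by hand so that it runs without the package installed, and it adds a hypothetical `label` column. The cut-off at 0 is an assumption chosen for illustration.

```r
# Scores copied by hand from the output above, so this runs without pattern.nlp.
scores <- data.frame(
  polarity     = c(-0.80, 0.70, 0.65),
  subjectivity = c(0.90, 1.00, 0.75),
  id           = c("i really really hate iphones",
                   "de wereld is een mooie plaats, nietwaar sherlock",
                   "j'aime Paris, c'est super"),
  stringsAsFactors = FALSE
)

# Hypothetical labelling rule: the cut-off at 0 is an assumption for illustration.
scores$label <- ifelse(scores$polarity < 0, "negative", "positive")

# Keep only the texts scored as negative.
scores[scores$label == "negative", c("id", "polarity")]
```

With real data you would replace the hand-built data.frame by the result of rbind() on the pattern_sentiment calls.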
Parts of Speech tagging is available for Dutch, French, English, Spanish & Italian.
x <- "Il pleure dans mon coeur comme il pleut sur la ville. Quelle est cette langueur qui penetre mon coeur?"
pattern_pos(x = x, language = 'french')
x <- "Avevamo vegliato tutta la notte - i miei amici ed io sotto lampade
di moschea dalle cupole di ottone traforato, stellate come le nostre anime,
perché come queste irradiate dal chiuso fulgòre di un cuore elettrico."
pattern_pos(x = x, language = 'italian')
We are also working on a Dutch wordnet, which will be fully released in due course. More information at https://github.com/weRbelgium/wordnet.dutch.
Hope you use the package for spreading new languages!
As part of the R course offering of BNOSAC, which you can find at http://bnosac.be/images/activities/bnosac_courses_r.pdf, we offer several 2-day hands-on courses covering the use of text mining tools for data analysis. The course covers basic text handling, natural language engineering and statistical modelling on top of textual data.
Interested in upgrading your text mining skills with R? You can register for the following dates.
2016: October 24-25: subscribe at https://lstat.kuleuven.be/training/coursedescriptions/text-mining-with-r
2016: November 14-15: subscribe at http://di-academy.com/event/text-mining-with-r/
2017: March 23-24: subscribe at https://lstat.kuleuven.be/training/coursedescriptions/text-mining-with-r
The following elements are covered in this course.
1. Import of (structured) text data with focus on text encodings. Detection of language
2. Cleaning of text data, regular expressions
3. String distances
4. Graphical displays of text data
5. Natural language processing: stemming, parts-of-speech (POS) tagging, tokenization, lemmatisation, entity recognition
6. Sentiment analysis
7. Statistical topic detection modelling and visualisation (latent dirichlet allocation)
8. Automatic classification using predictive modelling based on text data
9. Visualisation of correlations & topics
10. Word embeddings
11. Document similarities & Text alignment
Hope to see you there.
In 2 weeks, our 2-day crash course on Applied spatial modelling with R (April 13-14, 2016) will be given at the University of Leuven, Belgium: https://lstat.kuleuven.be/training/applied-spatial-modelling-with-r
During this course you will learn the following:
- The sp package to handle spatial data (spatial points, lines, polygons, spatial data frames)
- Importing spatial data and setting the spatial projection
- Plotting spatial data on static and interactive maps
- Adding graphical components to spatial maps
- Manipulation of geospatial data, geocoding, distances, …
- Density estimation, kriging and spatial point pattern analysis
More information: https://lstat.kuleuven.be/training/applied-spatial-modelling-with-r. Registration can be done at https://lstat.kuleuven.be/forms/courses
With the release of RStudio add-ins, a new era of productivity gains and new features for R users has arrived. Thanks to the help of Oliver, who has written an RStudio add-in on top of taskscheduleR, scheduling and automating an R script from RStudio is now exactly one click away if you are working on Windows.
How? Just install these R packages and you have the add-in ready at the add-in tab in your RStudio session. Select your R script and schedule it to run any time you want. Hope this saves you some day-to-day time and feel free to help make additional improvements. More information: https://github.com/jwijffels/taskscheduleR.
install.packages("taskscheduleR", repos = "http://www.datatailor.be/rcube", type = "source")
If you are working on a Windows computer and want to schedule your R scripts while you are off running, sleeping or having a coffee break, the taskscheduleR package might be what you are looking for.
The taskscheduleR R package is available at https://github.com/jwijffels/taskscheduleR and it allows R users to do the following:
i) Get the list of scheduled tasks
ii) Remove a task
iii) Add a task
- A task is basically a script with R code which is run through Rscript
- You can schedule tasks 'ONCE', 'MONTHLY', 'WEEKLY', 'DAILY', 'HOURLY', 'MINUTE', 'ONLOGON', 'ONIDLE'
- After the script has run, you can check the log, which can be found in the same folder as the R script. It contains the stdout & stderr of the Rscript.
library(taskscheduleR)
myscript <- system.file("extdata", "helloworld.R", package = "taskscheduleR")
## run script once within 62 seconds
taskscheduler_create(taskname = "myfancyscript", rscript = myscript,
  schedule = "ONCE", starttime = format(Sys.time() + 62, "%H:%M"))
## run script every day at 09:10
taskscheduler_create(taskname = "myfancyscriptdaily", rscript = myscript,
  schedule = "DAILY", starttime = "09:10")
## delete the tasks
taskscheduler_delete(taskname = "myfancyscript")
taskscheduler_delete(taskname = "myfancyscriptdaily")
- When the task has run, you can look at the log, which contains everything from stdout and stderr. The log file is in the directory where the R script is located.
## log file is at the place where the helloworld.R script was located
system.file("extdata", "helloworld.log", package = "taskscheduleR")
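The start time in the example above is passed as an "HH:MM" string, so it only has minute resolution. Adding 62 seconds to the current time presumably serves to make sure the formatted minute always lies in the future. The sketch below illustrates that idiom with a hypothetical helper, `start_in_about_a_minute` (not part of taskscheduleR), and then lists the scheduled tasks with `taskscheduler_ls()`. The listing is guarded so that the code also runs on machines without Windows or without the package.

```r
# Hypothetical helper (not part of taskscheduleR): format a start time
# 62 seconds after `now`, mirroring the Sys.time() + 62 idiom used above.
start_in_about_a_minute <- function(now = Sys.time()) {
  format(now + 62, "%H:%M")
}

# A fixed timestamp makes the result reproducible: 09:09:30 + 62s = 09:10:32.
start_in_about_a_minute(as.POSIXct("2016-04-01 09:09:30", tz = "UTC"))
# "09:10"

# Only talk to the Windows task scheduler when it is actually available.
if (.Platform$OS.type == "windows" &&
    requireNamespace("taskscheduleR", quietly = TRUE)) {
  tasks <- taskscheduleR::taskscheduler_ls()  # data.frame of scheduled tasks
  str(tasks)
}
```

Note that the formatted time wraps around midnight, so a task created at 23:59 gets the start time "00:00" or "00:01".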
Who wants to set up an RStudio add-in for this?