Run the following command in a terminal to import the e-Policy database:
"mysql -u username -p -h localhost DATA-BASE-NAME < epoldump20140210.sql". Replace "username" with the user you created when you installed the MySQL database, and "DATA-BASE-NAME" with the name of the database that should receive the dump.
Copy the Opinion Mining Software Prototype folder to your desired location and edit the files "db.cfg" and "/Problem Config/epolicy/Crawlers.db.cfg" so that they contain the appropriate connection information (database name, hostname, username and password).
In the terminal, start R and execute the following commands: "library('shiny');runApp('.');". Make sure you start R from the main folder of the prototype, because runApp('.') loads the application from the current working directory.
The stand-alone prototype includes a user interface that allows two types of users to log in (administrators and standard users), and provides a different set of functionalities for each type.
The interface for standard users allows them to select the problem / domain they wish to explore. For each domain, it lets the user select the topics she/he wants to visualize, as well as the time span to consider in this exploration. After this selection, the system draws a plot in which the user can see a line with the aggregated sentiment score over time, together with a confidence band around the line reflecting the variability in this aggregated score. A second plot shows the individual sentiment scores assigned to each post, which lead to the aggregated sentiment expressed by the line in the first plot. Figure 1 shows an illustration of this part of the GUI.
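As a rough illustration of how a line with a confidence band can be derived from per-post scores, the sketch below groups scores by day, takes the daily mean as the aggregated score, and uses the mean plus or minus 1.96 standard errors as the band. This is an assumption made for illustration only: the text does not state how the prototype aggregates scores or computes its band, and the function name, the data layout and the sample scores are hypothetical. Python is used here purely for illustration.

```python
from collections import defaultdict
from datetime import date
from math import sqrt
from statistics import mean, stdev


def aggregate_sentiment(posts, z=1.96):
    """Return (day, mean, lower, upper) tuples, one per day, sorted by day.

    `posts` is an iterable of (day, score) pairs. The band is the mean
    plus or minus z standard errors (an assumed choice, not the prototype's).
    """
    by_day = defaultdict(list)
    for day, score in posts:
        by_day[day].append(score)

    rows = []
    for day in sorted(by_day):
        scores = by_day[day]
        centre = mean(scores)
        # With a single post there is no spread to estimate, so the band collapses.
        if len(scores) > 1:
            half_width = z * stdev(scores) / sqrt(len(scores))
        else:
            half_width = 0.0
        rows.append((day, centre, centre - half_width, centre + half_width))
    return rows


# Hypothetical per-post sentiment scores for a single topic.
sample_posts = [
    (date(2014, 2, 10), 0.5),
    (date(2014, 2, 10), -0.5),
    (date(2014, 2, 11), 1.0),
]
band = aggregate_sentiment(sample_posts)
```

Each returned tuple corresponds to one point on the aggregated line and the lower and upper edges of the band at that date; the raw (day, score) pairs are what the second plot would display.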
Standard users can also drill down to individual posts. A table lists all posts within the selected time span (Figure 2), with a search box that allows easy filtering of these posts. The user may also select an individual post (through its ID) to obtain the text of this post together with the sentiment scores assigned to it for each topic.
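The drill-down described above amounts to a time-span filter, a free-text filter and a lookup by ID. A minimal sketch of that logic follows, assuming posts are held as dictionaries; the field names, the sample posts and the choice of case-insensitive substring matching are hypothetical and do not reflect the prototype's actual schema or search behaviour.

```python
from datetime import date


def filter_posts(posts, start, end, query=""):
    """Keep posts dated within [start, end] whose title or text contains `query`.

    Matching is case-insensitive; an empty query keeps every post in the span.
    """
    needle = query.lower()
    return [
        post
        for post in posts
        if start <= post["date"] <= end
        and (needle in post["title"].lower() or needle in post["text"].lower())
    ]


# Hypothetical posts; the field names are assumptions, not the prototype's schema.
posts = [
    {"id": 1, "date": date(2014, 1, 5), "title": "Solar subsidies", "text": "Good news for panels."},
    {"id": 2, "date": date(2014, 2, 1), "title": "Wind farms", "text": "Noise complaints rise."},
    {"id": 3, "date": date(2014, 3, 9), "title": "Solar tariffs", "text": "Costs are falling."},
]

# Posts from January and February 2014 that mention "solar".
hits = filter_posts(posts, date(2014, 1, 1), date(2014, 2, 28), query="solar")

# Selecting a single post through its ID.
selected = next(post for post in posts if post["id"] == 2)
```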
The stand-alone OM prototype also provides an administrator graphical user interface (Figure 3).
In the first section of this interface, admins can select the crawlers (by providing a command that will be run by the system), select the topics, and tune the parameters of the opinion mining models used by our prototype. Currently, it is possible to tune the following parameters of these models:
lowercharacters: Possible values are T or F. Defines whether all characters are converted to lowercase during the pre-processing of the text documents.
removepunctuation: Possible values are T or F. Defines whether all punctuation is removed during the pre-processing of the text documents.
removenumbers: Possible values are T or F. Defines whether all numbers are removed during the pre-processing of the text documents.
removesparseterms: Value between 0.01 and 1.00. Defines the value used when removing sparse terms during the pre-processing of the text documents.
rntree: Controls the number of trees to grow in the approaches that use random forests. The value should be a positive integer.
svmc: Sets the cost of constraints violation in support vector machines; it is the 'C'-constant of the regularization term in the Lagrange formulation. The value should be a positive integer.
svmep: Controls the epsilon in the insensitive-loss function of support vector machines. The value should be a positive integer.
svmg: Parameter used in the kernel of support vector machines. The value should be a positive integer.
nnets: Controls the number of units in the hidden layer of neural networks. The value should be a positive integer.
nnetd: Controls the weight decay in neural networks. The value should be a positive integer.
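The value ranges listed above can be checked before a configuration is submitted. The following sketch encodes exactly the ranges stated in this list; the function and variable names are hypothetical, and the prototype's own validation, if any, may differ.

```python
def _is_flag(value):
    # Boolean switches are written as the strings "T" or "F".
    return value in ("T", "F")


def _is_positive_int(value):
    # bool is a subclass of int in Python, so it is excluded explicitly.
    return isinstance(value, int) and not isinstance(value, bool) and value > 0


def _is_sparsity(value):
    is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
    return is_number and 0.01 <= value <= 1.00


# One rule per tunable parameter, following the ranges given in the list.
RULES = {
    "lowercharacters": _is_flag,
    "removepunctuation": _is_flag,
    "removenumbers": _is_flag,
    "removesparseterms": _is_sparsity,
    "rntree": _is_positive_int,
    "svmc": _is_positive_int,
    "svmep": _is_positive_int,
    "svmg": _is_positive_int,
    "nnets": _is_positive_int,
    "nnetd": _is_positive_int,
}


def invalid_parameters(params):
    """Return the sorted names of parameters that are unknown or out of range."""
    return sorted(
        name
        for name, value in params.items()
        if name not in RULES or not RULES[name](value)
    )


# Hypothetical settings: svmc is rejected because 0 is not a positive integer.
settings = {"lowercharacters": "T", "removesparseterms": 0.99, "rntree": 500, "svmc": 0}
problems = invalid_parameters(settings)
```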
A second section of the admin GUI allows admins to train new opinion mining models by clicking on a button, and also to inspect and tag new posts (Figure 4). For this latter task, we provide a table in which the available posts can be filtered by ID, date and title. Using these filters, the user may drill down to a specific post and tag it for sentiment concerning the available topics.
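The tagging workflow can be pictured as narrowing the table with the ID, date and title filters and then recording a label per topic for the chosen post. The sketch below illustrates this under stated assumptions: the dictionary layout, the function names, the sample posts and the use of -1 as a negative label are all hypothetical, since the text does not specify how tags are represented or stored.

```python
from datetime import date


def filter_for_tagging(posts, post_id=None, day=None, title=None):
    """Return the posts that match every filter that was supplied."""
    matches = []
    for post in posts:
        if post_id is not None and post["id"] != post_id:
            continue
        if day is not None and post["date"] != day:
            continue
        if title is not None and title.lower() not in post["title"].lower():
            continue
        matches.append(post)
    return matches


def tag_post(tags, post_id, topic, sentiment):
    """Store a manual sentiment label for one topic of one post."""
    tags.setdefault(post_id, {})[topic] = sentiment
    return tags


# Hypothetical untagged posts.
untagged = [
    {"id": 7, "date": date(2014, 2, 10), "title": "Biomass plant approved"},
    {"id": 8, "date": date(2014, 2, 10), "title": "Photovoltaic incentives cut"},
]

tags = {}
for post in filter_for_tagging(untagged, day=date(2014, 2, 10), title="photovoltaic"):
    # -1 stands for a negative label here; the label scale is an assumption.
    tag_post(tags, post["id"], topic="photovoltaic", sentiment=-1)
```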