Gnome Outreach Program for Women

My experience interning with Gnome's OPW


Scraping Meta Information from URLs

Last week I completed the following items for the readlater app, thanks to generous help from brantje:

  • Delete an item in the readlater app
  • Scrape meta info from a given URL
  • Fix the search function

For the delete function, I ran into an issue where the click event for sidebar items was not being caught in the JS. Per brantje’s recommendation, the direct jQuery click binding:

 $('a .icon-delete').click(function(){…});

needs to be replaced by a delegated handler in the ownCloud JS file, most likely because the sidebar items are added to the page after the binding code has already run:

$(document).on('click', 'a .icon-delete', function(){…});

Also, per brantje’s pull request, in the routes.php file, the route for delete function was changed from:

array('name' => 'item_api#remove_item', 'url' => '/deleteitem', 'verb' => 'DELETE'),

to

array('name' => 'item_api#remove_item', 'url' => '/deleteitem', 'verb' => 'GET'),


For scraping the meta information from a given URL, brantje recommended using cURL or file_get_contents to fetch the page and then searching for the description meta tag, as described at http://stackoverflow.com/questions/3711357/get-title-and-meta-tags-of-external-site

Per brantje’s PR, I added the logic for scraping meta information to the itemapicontroller.php file like so:

$isURL = (bool)parse_url($url);
if ($isURL) {
    // Fetch the page and parse it; @ suppresses warnings from malformed HTML
    $html = $this->file_get_contents_curl($url);
    $doc = new \DOMDocument();
    @$doc->loadHTML($html);

    // Use the first <title> element, if there is one, as the item title
    $nodes = $doc->getElementsByTagName('title');
    $title = ($nodes->length > 0) ? $nodes->item(0)->nodeValue : '';

    // Default to empty strings so pages without these meta tags still work
    $description = '';
    $keywords = '';
    $metas = $doc->getElementsByTagName('meta');
    for ($i = 0; $i < $metas->length; $i++) {
        $meta = $metas->item($i);
        if ($meta->getAttribute('name') == 'description') {
            $description = $meta->getAttribute('content');
        }
        if ($meta->getAttribute('name') == 'keywords') {
            $keywords = $meta->getAttribute('content');
        }
    }

    $item = array();
    $item['url'] = $url;
    $item['title'] = $title;
    $item['description'] = $description;
    $item['keywords'] = $keywords;

    $result['itemid'] = $this->ItemBusinessLayer->create($item);
}
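
To make the extraction step easier to experiment with outside ownCloud, here is a minimal sketch of the same idea in plain JavaScript. It is not the code from the PR: the function names extractMeta and getAttr are hypothetical, and it uses regular expressions on an HTML string instead of a DOM parser, which is fragile on real-world pages and only meant as an illustration.

```javascript
// Hypothetical helper: read one attribute value from a single tag string.
// Handles double-quoted and single-quoted values only.
function getAttr(tag, attrName) {
  const re = new RegExp('\\b' + attrName + '\\s*=\\s*(?:"([^"]*)"|\'([^\']*)\')', 'i');
  const m = tag.match(re);
  if (!m) return undefined;
  return m[1] !== undefined ? m[1] : m[2];
}

// Hypothetical sketch: pull the title plus the description and keywords
// meta tags out of an HTML string, defaulting each field to ''.
function extractMeta(html) {
  const titleMatch = html.match(/<title[^>]*>([\s\S]*?)<\/title>/i);
  const result = {
    title: titleMatch ? titleMatch[1].trim() : '',
    description: '',
    keywords: ''
  };
  const metaTags = html.match(/<meta\b[^>]*>/gi) || [];
  for (const tag of metaTags) {
    const name = getAttr(tag, 'name');
    const content = getAttr(tag, 'content');
    if (name === undefined || content === undefined) continue;
    const key = name.toLowerCase();
    if (key === 'description' || key === 'keywords') {
      result[key] = content;
    }
  }
  return result;
}
```

Unlike the PHP version, this sketch matches the meta tag name case-insensitively; that is a choice I made for the example, not something the original code does.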


 


Implementing List Feed Feature

Last week I completed the list feed feature for the readlater app.
Currently, on page load, the app shows a list of all the feeds that were saved for reading later, like so:

List Feed Feature


 

I faced a couple of glitches while developing this feature:

1. The items were being fetched from the DB but were not getting rendered. It turns out I was not attaching the elements to the UL.

2. Issues with deferred calls. I wrote the following logic to ensure that the items are attached only after all of them have been pushed to the items array.

// Call each function once and wait until both deferreds have resolved
$.when( showData(), showDataDone() ).done(function() {
    alert( 'Deferred success' );
})
.fail(function() {
    alert( 'Deferred fail' );
});
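
The same "wait for everything, then render" idea can be sketched with native Promises, where Promise.all plays the role of $.when. This is only an illustration: fetchItems, renderItems, and loadAllThenRender are hypothetical stand-ins, and the example URLs are made up rather than taken from the app.

```javascript
// Hypothetical loader: resolves asynchronously with a batch of saved items.
function fetchItems() {
  return new Promise((resolve) => {
    setTimeout(() => resolve([
      { url: 'http://example.com/a' },
      { url: 'http://example.com/b' }
    ]), 10);
  });
}

// Hypothetical renderer: turns items into <li> strings for the list.
function renderItems(items) {
  return items.map((item) => '<li>' + item.url + '</li>');
}

// Start every loader, wait until all of them have resolved, and only then
// render the combined items. If any loader rejects, nothing is rendered
// and the returned promise rejects.
function loadAllThenRender(loaders) {
  return Promise.all(loaders.map((load) => load()))
    .then((batches) => renderItems([].concat(...batches)));
}
```

Rendering inside the then callback guarantees the items array is complete before anything is attached to the list, which was the problem I was trying to solve with the deferred calls.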

3. Issues with using the correct CSS class for rendering the icons. With help from brantje and LEDfan, I was able to add the images for star, edit, and delete from the CSS like so:

.icon {
 padding-left: 25px;
 background-repeat: no-repeat;
 height: 16px;
 width: 16px;
}
.star-icon {
 background-image: url('../img/star.png');
}
.edit-icon {
 background-image: url('../img/rename.png');
}
.delete-icon {
 background-image: url('../img/delete.png');
}

Once I had these classes defined, I could use the icons with any other element as a background image.

Creating Add Feature for ReadLater App

This week I completed the add functionality for the ReadLater app, thanks to generous help from the community, especially LEDfan, brantje, and Raydiation.

This blog post provides details on how I built this feature.

I started by understanding how existing apps work on the ownCloud platform. For this feature, I looked at the Bookmarks, Passman, and News apps. To develop the front end of the readlater app, I looked at the existing News app implementation and created this front end for the ReadLater app:


ReadLater App Front-end

For this purpose, I looked at the docs for manipulating CSS and HTML: ownCloud App Development – Front End.

For the front end development, you need to start with the /templates/main.php file in your app.

While developing the front end, you might come across the “Template file not found” exception:

Template file error owncloud app development

 

This exception happens when you specify an incorrect route in your appinfo/app.php file. To resolve it, specify the correct route. For the readlater app, the index page is specified as:

'href' => \OCP\Util::linkToRoute('readlater.page.index'),

I also came across the “no app name” error. It occurs because the repo on GitHub is named ReadLater, so the cloned directory was also called ReadLater, but ownCloud only allows readlater as the directory name.

I then added the functionality to save content in the readlater app; however, I was not able to insert values into the database. For creating the backend, I looked at Passman’s implementation at: https://github.com/brantje/passman/blob/master/controller/itemapicontroller.php

I also saw this error in the log file (to access the log file, go to http://$your-owncloud-server/data/owncloud.log):

{"app":"PHP","message":"Cannot modify header information - headers already sent at \/var\/www\/core\/lib\/private\/appframework\/app.php#68","level":3,"time":"2014-07-18T11:14:13+00:00"}

Per discussion with Raydiation on the IRC channel, it was probably because I had removed the annotations from the Controller.

The App Framework also provides a simple base class for adding controllers: OCA\AppFramework\Controller\Controller. Controllers connect your view (templates) with your database and contain the logic of your app. Controllers themselves are connected to one or more routes. Controllers go into the controller/ directory.

For security reasons, all security checks for controller methods are turned on by default. To explicitly turn off checks, you must use exemption annotations above the desired method.

Possible annotations include:

  • @CSRFExemption: Turns off the check for the CSRF token. Only use this for the index page!
  • @IsAdminExemption: Turns off the check if the user is an admin
  • @IsLoggedInExemption: Turns off the check if the user is logged in
  • @IsSubAdminExemption: Turns off the check if the user is a subadmin
  • @Ajax: Use this for Ajax Requests. It prevents the unneeded rendering of the apps navigation and returns error messages in JSON format

It is important to add your controller to the dependency injection container in dependency injection/dicontainer.php. (source: ownCloud Controllers)

To check whether the Add button was working, I used the Network tab of the developer tools in my browser. For the above error, in addition to the missing annotations, I was also using an incorrect route.

Routes are declared in appinfo/routes.php. Routing connects your URLs with your controller methods and allows you to create constant and nice URLs. It’s also easy to extract values from the URLs. For more information on how to implement routes in your app, see: ownCloud Routes.
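
As I understand the naming convention, the part of a route name before the # identifies the controller and the part after it identifies the method, with underscores converted to camel case. The sketch below illustrates that mapping in plain JavaScript; resolveRouteName and toCamelCase are hypothetical helpers written for this post, not part of ownCloud.

```javascript
// Hypothetical helper: convert snake_case to camelCase, optionally
// capitalizing the first word as well (for class names).
function toCamelCase(snake, capitalizeFirst) {
  return snake.split('_').map((part, i) =>
    (i === 0 && !capitalizeFirst)
      ? part
      : part.charAt(0).toUpperCase() + part.slice(1)
  ).join('');
}

// Hypothetical helper: map a route name such as 'item_api#remove_item'
// to the controller class and method it is expected to reach.
function resolveRouteName(name) {
  const [controllerPart, methodPart] = name.split('#');
  return {
    controller: toCamelCase(controllerPart, true) + 'Controller',
    method: toCamelCase(methodPart, false)
  };
}
```

With this mapping, the route name item_api#remove_item from routes.php points at a removeItem method on an ItemApiController class, which is a quick thing to check when a route does not seem to reach your code.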

After fixing the routes, I was still seeing a blank page even though the POST request went through fine.

Routes-Error

 

It turns out I had missed the OC.generateUrl function in my JS file.

To send requests to ownCloud, the base URL where ownCloud is currently running is needed. Full URLs can be generated by using:

var authorUrl = OC.generateUrl('/apps/myapp/authors/1');

For the readlater app’s Add url, my code in the js file is now:

function saveData(){
    $.ajax({
        type: "POST",
        url: OC.generateUrl('/apps/readlater/add/url'),
        data: {url: $('#url').val()}
    }).done(function( msg ) {
        alert( "Your content was saved: " + msg );
    });
}
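
To show why the call matters, here is a simplified stand-in for what I understand URL generation to do: prefix the app path with the web root where ownCloud is installed plus /index.php, and fill in any {placeholder} parameters. The generateUrl function below is a hypothetical sketch that takes the web root as an argument; the real OC.generateUrl reads it from ownCloud itself, and the '/owncloud' web root is an assumed example.

```javascript
// Hypothetical sketch of URL generation, not ownCloud's implementation.
// webroot: path where ownCloud is installed ('' if it is at the domain root)
// path:    app route, optionally containing {name} placeholders
// params:  values for the placeholders (optional)
function generateUrl(webroot, path, params) {
  const values = params || {};
  const filled = path.replace(/\{([^{}]*)\}/g, (match, key) =>
    Object.prototype.hasOwnProperty.call(values, key)
      ? encodeURIComponent(values[key])
      : match // leave unknown placeholders untouched
  );
  return webroot + '/index.php' + filled;
}
```

A hard-coded path such as /apps/readlater/add/url only works when ownCloud happens to be installed in a way that matches it, which would explain the blank page I saw before switching to the generated URL.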

I also encountered this exception:

{"app":"index","message":"Doctrine\\DBAL\\DBALException: An exception occurred while executing 'INSERT INTO `oc_readlater_items` (`url`) VALUES (?)':\n\nSQLSTATE[42S02]: Base table or view not found: 1146 Table 'owncloud.oc_readlater_items' doesn't exist","level":4,"time":"2014-07-19T13:02:52+00:00"}

To fix this, LEDfan suggested adding a file “version” in the appinfo dir:
https://github.com/ruchita20/readlater/blob/master/appinfo/version

Also, it is important to note that whenever you change anything in your appinfo/database.xml file, you must increment the version number in both the version file and the appinfo/info.xml file.

Thanks to the help from the ownCloud IRC channels, I was able to complete the add URL feature for ReadLater app.

How to commit your apps to ownCloud?

I hit several roadblocks while contributing my code to the ownCloud app repository.

For my git issues, I looked at several Stack Overflow posts and GitHub’s documentation.

Many thanks to ownCloud’s helpful developer community (especially @ledfan and @brantje) for hand-holding me through resolving these issues.

This blog post provides detailed information about the steps for contributing your application to ownCloud.

Step 1: If you don’t have a GitHub account, sign up for a free one.

Step 2: Start your own repository.

Step 3: Do not commit anything.

Step 4: From the command line, execute the following command:

git init

This initializes an empty Git repository in your web server folder. In my case, the repository was initialized at:

/var/www/html/core/apps/readlater/.git/

Step 5: Prepare for your initial commit using the following commands:

git add . -A

git commit -m "Initial Commit"

git remote add origin git@github.com:$gitusername/$gitrepository

Step 6: Push your repository using the following command:

git push origin master

The next blog post in this series is about my status update and learnings about the UI for ownCloud’s ReadLater app. I am now reworking the low fidelity mockups to incorporate UI trends like app-navigation based on the issues listed here: ownCloud Design Issues for ReadLater.

Needs Assessment Plan for ReadLater App

Based on my initial discussions with project mentors and after following the discussion threads on ownCloud’s git issue tracker, here is the wishlist of features for the app:

  1. Integrate the API developed by Wallabag (Poche):

(a) Wallabag (formerly Poche) is an open-source self hostable application for saving web pages.

2. Implement a server for ownCloud that works with wallabag’s API and thus can use their existing Android app, browser extensions, and Firefox OS app.

3. Design and implement a front-end for the Read-later items within ownCloud and possibly combine with the Bookmarks app in ownCloud.

Based on the discussions with project mentors, I researched the usability aspects of various bookmarklet apps and came up with the following list of design changes:

  1. The app should be able to intuitively build a dynamic news feed based on the chronologically selected tags.

Why we need this: once in a while, you want to clean up your accumulated tags (just like you would spring clean your house). Sometimes we really need to read an article we have tagged, bookmarked, or added to our account, but we forget the tag we gave it. Heck, we even forget the title of the article; all we remember is maybe an image or the central theme.

2. If the app can dynamically create a news feed based on the past week’s data, the user could go through the articles without spending considerable time searching for them (maybe with a Pinterest-type thumbnail view of last week’s articles, or by asking the user to rate an article, for example whether they really want to revisit it after 8 days or so). And because we are in spring cleaning mode, it would be nice to have an “archive” option to dump all the clutter.

3. I wanted to understand how users will engage with this app, so I came up with the following use case scenarios:

(a) Maybe have a ranking of the most talked about articles/pages to keep you interested in the discussion

(b) Have an ability to tag any web resource (images, videos, podcasts, pages, etc.)

(c) Add ability to visualize your feed’s influence on your social network (items you share from your feed, etc.)

(d) When I click on the bookmarks that I have saved previously, it would be nice to see a preview of the page, so that I can decide whether to follow that bookmark/readlater tag.

4. I also proposed the following generic enhancements:

(a) It would be nice to have a tag merging option to make it easier to reorganize the collected pages.

(b) Single browser icon to bookmark a page (right now it is a clunky text “Add to ownCloud”, which takes up a lot of space on the Bookmarks Bar)

(c) Suggest tags based on the page’s content (to avoid tag soup for misspelled tags)

(d) Ability to share bookmarks (public, private)

(e) The ability to import bookmarks from other accounts is hidden under the settings icon. This gives a rather broken experience for a first-time user: as a first-time user of the application, I want to be able to bring in all the data from my accounts with a single click. (Right now the user needs to click the Bookmarks app first, then go to the settings icon to reach the “Import” function.)

5. Based on Alessandro’s idea about the app: “Read later app should be able to distinguish between bookmarks and pages that are part of the reading list (may be have separate icons for bookmarks and reading list)”

To enhance this feature, it would be nice to provide flexibility to the user for choosing whether to classify a URL as bookmark or read later.

Resources

For the above usability exercise, I used the following resources:

  1. Usability issues filed for the Bookmarks feature.
  2. Usability testing for Instapaper, Pocket, feedafever, and other apps listed in https://github.com/wallabag/wallabag/issues/429
  3. Inferences from the social tagging behavior: Trant, Jennifer. “Studying social tagging and folksonomy: A review and framework.” Journal of Digital Information 10, no. 1 (2009) and UC Berkeley School of Information’s course on Information Organization and Retrieval — INFO 202