Webmaster Academy - Webmaster Tools Help
Welcome to Webmaster Academy! Our goal is to help you create great sites that perform well in Google search results.
Beginner's guide to using Google and its tools
Today I went on the hunt for a GREP-like tool that would let me search through a bunch of source code files and pull out all the web service calls. I needed the tool to support regular expression searches, and ideally it would also export the matches. PowerGREP looked like the tool of choice, but it costs $159. I tried a few free alternatives, including Agent Ransack, AstroGrep, grepWin, and Windows Grep. Agent Ransack looked really good because it could export the results. AstroGrep was nice too because it had a clean, simple interface. However, I ran into the same problem with all of these programs: they don’t return matches that span more than one line (multi-line). That was a deal breaker for me.
Eventually I came across dnGREP which is an open source GREP tool built with .NET. The interface is really nice and the best feature of all: multi-line support! It also has a nice little window that lets you test out your regular expression against some text that you provide. Sadly it doesn’t provide a way to export the results, but it does have a lot of features considering it is free.
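To illustrate what multi-line matching means in practice, here is a minimal JavaScript sketch. It is not how dnGREP works internally; the sample source text and the callService(...) call name are hypothetical, chosen only to show a pattern that can and cannot cross line breaks.

```javascript
// Hypothetical source text: one web service call on a single line,
// and one whose argument list spans several lines.
const source = [
  'var a = callService("getUser", id);',
  'var b = callService(',
  '    "getOrders",',
  '    id',
  ');',
].join("\n");

// [\s\S] matches any character, newlines included, so a match can
// continue past the end of a line. The lazy quantifier stops at the first ")".
const multiLine = /callService\([\s\S]*?\)/g;

// "." does not match newlines by default, so this pattern only finds
// calls that open and close on the same line.
const singleLine = /callService\(.*?\)/g;

const multiLineMatches = source.match(multiLine) || [];
const singleLineMatches = source.match(singleLine) || [];

console.log("multi-line aware matches:", multiLineMatches.length);
console.log("single-line matches:", singleLineMatches.length);
```

A tool that evaluates the pattern one line at a time behaves like the second pattern and silently misses the second call.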
Posted in jQuery, ColdFusion | Posted on 04-06-2012 | 2,631 views
Earlier this week James Moberg introduced me to a cool little Java utility - jsoup. jsoup provides jQuery-like HTML manipulation to your server. Given a string, or a URL, you can do things like, find all the images, look for links to a PDF, and so on. Basically - jQuery for the server. I thought I'd whip up a quick ColdFusion-based demo of this so I could see how well it works.
I began by downloading the jar file and dropping it into a folder called jars. Then, using ColdFusion 10, it was trivial to make it available to my code:
component {
    this.name = "jsoup_demo";
    this.javaSettings = {loadPaths=[expandPath("./jars")]};
}
I then whipped up a demo that loaded (and cached) CNN's html. I create an instance of jsoup, parse the HTML, and then run a "select" using my selector, in this case, just 'img':
<!--- cache it to speed it up --->
<cfif not cacheIdExists("cnnhtml")>
    <cfhttp url="http://www.cnn.com">
    <cfset cnnhtml = cfhttp.filecontent>
    <cfset cachePut("cnnhtml",cnnhtml)>
<cfelse>
    <cfset cnnhtml = cacheGet("cnnhtml")>
</cfif>
<cfscript>
jsoup = createObject("java", "org.jsoup.Jsoup");
doc = jsoup.parse(cnnhtml);
links = doc.select("img");
</cfscript>
<cfloop index="e" array="#links#">
    <cfoutput>#e.attr("src")# --- Title: #e.attr("title")# --- Alt: #e.attr("alt")#<br/></cfoutput>
</cfloop>
Notice how I can loop over the matches and grab attributes from each one. Again, very jQuery-like. I wanted to play with this a bit more free form so I created an application that lets me supply any URL and any selector. Here's that code - minus the UI cruft around it:
<cfparam name="form.url" default="">
<cfparam name="form.selector" default="">
<form class="well" method="post">
<cfoutput>
<p><label for="url">URL:</label> <input type="url" name="url" id="url" required value="#form.url#"></p>
<p><label for="selector">Selector:</label> <input type="text" name="selector" id="selector" required value="#form.selector#"></p>
<p><input type="submit" value="Run Test" class="btn btn-primary"></p>
</cfoutput>
</form>
<cfif isValid("url",form.url) and len(trim(form.selector))>
    <!--- cache it to speed it up --->
    <cfif not cacheIdExists(form.url)>
        <cfhttp url="#form.url#">
        <cfset html = cfhttp.filecontent>
        <cfset cachePut(form.url,html)>
    <cfelse>
        <cfset html = cacheGet(form.url)>
    </cfif>
    <cfset jsoup = createObject("java", "org.jsoup.Jsoup")>
    <cfset doc = jsoup.parse(html)>
    <cfset elements = doc.select(form.selector)>
    <table class="table table-striped table-bordered">
    <cfloop index="e" array="#elements#">
        <cfoutput><tr><td>#htmlEditFormat(e.toString())#</td></tr></cfoutput>
    </cfloop>
    </table>
</cfif>
You can run this yourself by hitting the demo below. All in all - a very interesting Java library. Sure you could do all of this with regular expressions, but I find this syntax a heck of a lot more friendly. (And that's with me having used regex for the past 15 years.)
Talk about synchronicity - within 10 minutes of each other, both Ben Nadel and I posted on the same topic! Parsing, Traversing, And Mutating HTML With ColdFusion And jSoup
Comments
MikeZ said on 04-06-2012 at 8:33 AM
Quite interesting. I often need to parse data from various HTML files and this would actually make it a lot more comfortable. But does it require valid XML or is it able to handle formatting errors up to a certain degree?
Raymond Camden said on 04-06-2012 at 8:34 AM
According to their site, they specifically support messy HTML. Try a site that you know is messy with my tester app.
MikeZ said on 04-06-2012 at 8:53 AM
Thanks, I'll give it a try.
John Lang said on 04-06-2012 at 8:55 AM
This is awesome, thanks for posting it Ray!
James Moberg said on 04-06-2012 at 10:45 AM
Using CFDUMP when reviewing what jsoup returns is extremely beneficial. I used jsoup's whitelist to automatically remove invalid/bloated Microsoft markup from HTML in an RSS feed.
http://pastebin.com/rvkt2GCC
Raymond Camden said on 04-06-2012 at 10:52 AM
Yeah - using cfdump on Java objects is a quick/dirty way to see the API. Of course, jsoup does have their JavaDocs online too. :)
MikeZ said on 04-06-2012 at 11:03 AM
Thanks James, I almost forgot about the most obvious use for this thing. Recently encountered a case where I needed to remove markup junk from copy&pasted Outlook mail content.
Rob Brooks-Bilson said on 04-09-2012 at 6:24 PM
Ray, I noticed you used the new ColdFusion 10 cacheIDExists() function in your code. I wanted to point out that I'm not a fan of the function for the most part, as it adds an extra cache lookup on a cache hit when used the way the documentation shows:
1st call - does it exist?
2nd call - it exists, so call the cache and get it (the 2nd lookup) or it doesn't exist, so get it then put it in the cache
Now consider this code:
<cfset cnnhtml = cacheGet("cnnhtml")>
<cfif isNull(cnnhtml)>
    <cfhttp url="http://www.cnn.com">
    <cfset cnnhtml = cfhttp.filecontent>
    <cfset cachePut("cnnhtml",cnnhtml)>
</cfif>
Notice that in the case of a get, if the item is in the cache, that's it - just one call. If the item doesn't exist then you grab it and put it in the cache. This might seem like a small deal, but in a large scale system, it could be significant.
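The difference between the two patterns can be made concrete with a small JavaScript sketch. This is not ColdFusion's or ehcache's API; CountingCache, getWithExistsCheck, getThenPut, and loadPage are hypothetical names used only to count lookups on a cache hit.

```javascript
// A tiny in-memory cache that counts lookups so the two patterns can be compared.
class CountingCache {
  constructor() {
    this.store = new Map();
    this.lookups = 0;
  }
  has(key) { this.lookups += 1; return this.store.has(key); }
  get(key) { this.lookups += 1; return this.store.get(key); }
  put(key, value) { this.store.set(key, value); }
}

// Pattern 1: check for existence, then fetch (two lookups on a hit).
function getWithExistsCheck(cache, key, load) {
  if (!cache.has(key)) {
    const value = load();
    cache.put(key, value);
    return value;
  }
  return cache.get(key);
}

// Pattern 2: fetch first and treat "missing" as a miss (one lookup on a hit).
function getThenPut(cache, key, load) {
  let value = cache.get(key);
  if (value === undefined) {
    value = load();
    cache.put(key, value);
  }
  return value;
}

// Hypothetical loader standing in for the HTTP request.
const loadPage = () => "<html>hypothetical page</html>";

const cacheA = new CountingCache();
getWithExistsCheck(cacheA, "cnnhtml", loadPage); // first call is a miss
cacheA.lookups = 0;
getWithExistsCheck(cacheA, "cnnhtml", loadPage); // second call is a hit
const hitLookupsExistsCheck = cacheA.lookups;

const cacheB = new CountingCache();
getThenPut(cacheB, "cnnhtml", loadPage); // miss
cacheB.lookups = 0;
getThenPut(cacheB, "cnnhtml", loadPage); // hit
const hitLookupsGetThenPut = cacheB.lookups;

console.log(hitLookupsExistsCheck, hitLookupsGetThenPut);
```

Whether the extra lookup matters depends on how expensive a lookup is in the underlying cache, which is the question raised in the replies that follow.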
Raymond Camden said on 04-10-2012 at 10:27 AM
Are you saying ehcache doesn't provide a nicer way of checking for something than asking for it and noticing it is null? Or that CF doesn't have a way of using a nicer API? Is the check that expensive?
Rob Brooks-Bilson said on 04-10-2012 at 10:45 AM
I'm saying that using cacheIDExists() is less efficient. It's not expensive for small apps. It is expensive at scale.
Raymond Camden said on 04-10-2012 at 10:47 AM
Is this a CF wrapper issue or just a fact of life with ehcache?
OK, in our last two posts we built a basic grid and populated it with data. As you can see, so far it's a very basic grid.
Nothing much to it, really. Let's start adding a few important pieces. jqGrid includes a large set of Column Model Options, but there really aren't a ton that you'll need. Here we'll run through some basics.
First, the id field really isn't something you typically need to show anyone. So, just hide it.
jqGridDemo.js - Column Model - Hide
colModel: [
    {name: 'ID', hidden: true},
    ...
],
Oh yeah, and the views column is a count of the number of page views. As a number, it should probably be right justified.
jqGridDemo.js - Column Model - Align
colModel: [
    ...
    {name: 'VIEWS', align: 'right'}
],
Great! But that column is way too wide! Without any additional info, jqGrid will attempt to size the columns according to their data, and currently it's just making three even columns. Let's size it down.
jqGridDemo.js - Column Model - Width
colModel: [
    ...,
    {name: 'VIEWS', align: 'right', width: 60, fixed: true}
],
Alright! We've set a 'fixed' width, so that any resizing of the grid (even automatic resizing) will maintain the set column width. We set it wide enough to fit the full column title, plus some room for the sort markers shown when the column is being sorted.
Now let's talk about three options that can be somewhat confusing: index, label and name. Up until now we've used the name option, which has mirrored the column name being returned. However, we might want our column header to be different than the actual column name. For this, we use the label option.
jqGridDemo.js - Column Model - Label
colModel: [
    ...,
    {name: 'POSTED', label: 'Release Date'}
],
This changed the label used in the column header, while maintaining a reference used when sorting the grid by the posted field. This is good, until you do something like this:
jqGridDemo.js - Column Model - Remap
grid.jqGrid('setGridParam',{remapColumns:[
    gridCols['ID'] + gridMultiSelect,
    gridCols['ID'] + gridMultiSelect,
    gridCols['TITLE'] + gridMultiSelect,
    gridCols['POSTED'] + gridMultiSelect,
    gridCols['VIEWS'] + gridMultiSelect
]});
jqGridDemo.js - Column Model - Index
colModel: [
    {name: 'ID', hidden: true},
    {name: 'Action', index: 'ID', label: 'Action', width: 80, fixed: true, sortable: false, align: 'center'},
    {name: 'Title'},
    {name: 'Posted', label: 'Release Date'},
    {name: 'Views', align: 'right', width: 60, fixed: true}
],
"Cutter, What are you doin' ta me!?!" Yeah, now it's confusing. I've added a column, one that also references the ID field in the return dataset. In this instance the index isn't truly necessary, but I'll try to explain it anyway. Up until now, jqGrid has used the name option as the value passed back to the server on a sort request. Here's the thing though: each column has to have a unique reference. That's what the name option is for: being a unique column reference within jqGrid. So, if you have two columns whose underlying data is the same (as our new Remap config shows), then you need a unique reference for jqGrid (the name), plus the index field reference that jqGrid will send back to the server on sort requests (again, with the sortable: false I've thrown in here, it's moot for us). So, to recap:
- name - A unique column reference used by jqGrid
- index - A data field reference used in sort requests. If not present then the name is used.
- label - If present it will override the name option, for what to display in the column header.
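The fallback rules in this recap can be sketched in plain JavaScript. The helpers below (headerText, sortField, hasUniqueNames) are hypothetical and are not jqGrid functions; they only mirror the three rules as stated above, using a trimmed copy of the column model from this post.

```javascript
// Hypothetical helpers mirroring the recap; none of these are part of jqGrid.

// label overrides name for the column header text.
function headerText(col) {
  return col.label !== undefined ? col.label : col.name;
}

// index is the field sent on sort requests; name is used when index is absent.
function sortField(col) {
  return col.index !== undefined ? col.index : col.name;
}

// name must be unique within the column model.
function hasUniqueNames(cols) {
  return new Set(cols.map((c) => c.name)).size === cols.length;
}

const colModel = [
  { name: 'ID', hidden: true },
  { name: 'Action', index: 'ID', label: 'Action' },
  { name: 'Posted', label: 'Release Date' },
];

for (const col of colModel) {
  console.log(col.name, '| header:', headerText(col), '| sort field:', sortField(col));
}
console.log('unique names:', hasUniqueNames(colModel));
```

Both the ID and Action columns resolve to the same sort field while keeping distinct names, which is the situation the index option exists for.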
You probably noticed that I added a little something to our column remap code.
gridCols['ID'] + gridMultiSelect,
This goes along with a new variable I added to our global variable declarations at the very top of our script.
var gridCols = {set:false},
    gridMultiSelect = 0;
I'll probably not use that on this round, but it will become important, so I'll leave it for now.
Column Formatting
Now that we've talked about some of the more important column options, let's get into column formatting. Now that we've added some configuration you'll notice a new Action column. Right now, if you ran your template, you'd see a truncated ID value in the cells. We'll need that ID in our output, but the Action column we're building will be used to display action icons (edit, delete, etc). jqGrid has functions for doing this, if you're using its edit packages, but my app has custom editors for a lot of this data, so we'll apply a custom column formatter to show these action icons.
jqGrid provides predefined formatters for many things, but you can also create your own custom formatters to create your own cell templates. A custom formatter is just a function, applied through the column model, that returns the string to be displayed in the cell. Your function will take three arguments, cellvalue, options, and rowObject. The cellvalue is the value of the data that jqGrid is trying to apply to the cell (in our case, a record's ID). The options is an object containing the rowId and the colModel of the record being applied. The rowObject is the data for the entire row of the record being applied.
jqGrid provides the ability to apply these functions as extensions of its built-in formatter package. Let's write a basic actionFormatter function that returns just the first two characters of the ID field, to get started.
jqGridDemo.js - actionFormatter - figure 1
$.extend($.fn.fmatter, {
    actionFormatter: function(cellvalue, options, rowObject) {
        return cellvalue.substr(0,2);
    }
});
jqGridDemo.js - Column Model - Custom Formatter
colModel: [
    ...
    {name: 'Action', index: 'ID', label: 'Action', width: 80, fixed: true, sortable: false, align: 'center', formatter: 'actionFormatter'},
    ...
],
That was easy! You can now see how we get at the value being applied to the cell. Now let's really change it up, by applying the custom output we discussed before. First, we need the style references to the icons we're going to use.
jqGridDemo.css - icons
/* Basic layout of all trigger icons */
.icon-trigger { margin: 2px; vertical-align: middle; display: inline-block; width: 16px; height: 16px; }
.action-trigger { cursor: pointer; }
.disabled-trigger { opacity: 0.4; filter: alpha(opacity=40) !important; }

/* delete icon image for trigger */
.delete { background: url('/resources/images/icons/delete.png') no-repeat scroll 0px 0px transparent !important; }

/* pencil icon image for trigger */
.pencil { background: url('/resources/images/icons/pencil.png') no-repeat scroll 0px 0px transparent !important; }
For our demo, I'm using the highly useful FamFamFam Silk icon library. Here I've defined some classes for the display of icon 'triggers', or icons that are used as buttons for actions. Next, we'll adjust our actionFormatter to apply the proper output.
jqGridDemo.js - actionFormatter - figure 2
$.extend($.fn.fmatter, {
    actionFormatter: function(cellvalue, options, rowObject) {
        // span is not a void element, so each icon gets an explicit closing tag
        var retVal = "<span class='icon-trigger action-trigger pencil' rel='" + cellvalue + "'></span>";
        retVal += "<span class='icon-trigger action-trigger delete' rel='" + cellvalue + "'></span>";
        return retVal;
    }
});
As you can see, now when you re-run your template you have a nice, formatted Action column, with action icons for 'edit' and 'delete'.
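Because a custom formatter is just a function that returns a string, its output can be checked outside the grid. The sketch below is a variation on the formatter, not the version used in this demo: it assumes the ID might contain characters that need escaping inside an attribute, and escapeAttr and buildActionCell are hypothetical names.

```javascript
// Hypothetical helper: escape characters that are unsafe inside an HTML attribute.
function escapeAttr(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Same three-argument shape as a jqGrid custom formatter; only cellvalue is used here.
function buildActionCell(cellvalue, options, rowObject) {
  const id = escapeAttr(cellvalue);
  return ['pencil', 'delete']
    .map((icon) =>
      "<span class='icon-trigger action-trigger " + icon + "' rel='" + id + "'></span>")
    .join('');
}

console.log(buildActionCell('A1B2'));
```

A function like this could be registered the same way as actionFormatter above; the only difference is the escaping step.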
So, in this post we covered some of the more important Column Model display options, as well as creating a custom column formatter. In our next entry we'll tie some functions to our 'action icons', and talk about row selection options. You can find sample code attached in the Download link at the bottom of the page.
This entry was posted on December 28, 2011 at 4:10 PM and has received 517 views. There are currently 0 comments. Download attachment.
Series of demos of jqGrid
Importing data from a spreadsheet to a database table using ColdFusion
Friday 23 December 2011
I'm currently working on a project where I need to import a list of products from a spreadsheet into a database table. Here is the ColdFusion script I have written to complete the task.
<cfspreadsheet action="read" src="#ExpandPath( './data.xlsx' )#" query="importdata" headerrow="1" />
<cfset failedimports = "" />
<cfloop query="importdata" startrow="2">
    <cfif !Len( product_title ) || !Len( product_code ) || !IsNumeric( product_price )>
        <cfset failedimports = ListAppend( failedimports, product_code ) />
    <cfelse>
        <cftry>
            <cfquery datasource="mydatasource" result="foobar">
                REPLACE INTO Products (product_title, product_code, product_price)
                VALUES (
                    <cfqueryparam cfsqltype="cf_sql_varchar" value="#product_title#" />,
                    <cfqueryparam cfsqltype="cf_sql_varchar" value="#product_code#" />,
                    <cfqueryparam cfsqltype="cf_sql_double" value="#Val( product_price )#" />
                )
            </cfquery>
            <cfcatch type="any">
                <cfset failedimports = ListAppend( failedimports, product_code ) />
            </cfcatch>
        </cftry>
    </cfif>
</cfloop>
<cfoutput>
    Failed Imports
    <cfif ListLen( failedimports )>
        Oops! #ListLen( failedimports )# products could not be imported.
        <cfloop list="#failedimports#" index="index">
            #index#
        </cfloop>
    <cfelse>
        No products failed to be imported.
    </cfif>
</cfoutput>
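The validation step in the script above (a row needs a title, a code, and a numeric price before it is written) can be sketched in JavaScript. This is only an illustration: isValidProduct and collectFailedCodes are hypothetical names, the sample rows are made up, and the numeric check approximates rather than reproduces ColdFusion's IsNumeric().

```javascript
// Hypothetical validator mirroring the three checks in the import loop.
function isValidProduct(row) {
  const hasTitle = String(row.product_title ?? '').length > 0;
  const hasCode = String(row.product_code ?? '').length > 0;
  const price = String(row.product_price ?? '').trim();
  // Number('') is 0, so an empty price is rejected explicitly.
  const hasNumericPrice = price !== '' && Number.isFinite(Number(price));
  return hasTitle && hasCode && hasNumericPrice;
}

// Collect the product codes of rows that fail validation.
function collectFailedCodes(rows) {
  return rows.filter((row) => !isValidProduct(row)).map((row) => row.product_code);
}

// Made-up rows standing in for the spreadsheet query.
const rows = [
  { product_title: 'Widget', product_code: 'W-1', product_price: '9.99' },
  { product_title: '', product_code: 'W-2', product_price: '5' },
  { product_title: 'Gadget', product_code: 'G-1', product_price: 'n/a' },
];

const failedCodes = collectFailedCodes(rows);
console.log(failedCodes.join(','));
```

Rows that pass this check would still need the database write wrapped in error handling, as the try/catch in the original script does.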
I update my hosts file a lot due to the many VPN connections and locations I commute to.
Normally I would have to open Notepad as an administrator, hope the etc path is still remembered (and if not, traverse those multiple directories), change the file type filter to all files, open hosts, modify, save, and close.
But I finally thought of a way to save me some time by creating a simple shortcut on my desktop.
- Create a new shortcut
- Update the Target property to “%windir%\system32\notepad.exe C:\Windows\System32\drivers\etc\hosts”
- Rename the shortcut to something like “Edit Hosts File”
Now each time, just run the shortcut as the administrator; no more having to track down the hosts file.
Quick Grails Logging Tip
Posted: November 28th, 2011 | Author: Christopher Vigliotti | Filed under: Grails | No Comments »
My fun with Grails continues. Today I was having a bit of difficulty wrapping my head around the concept of dumping and logging. Dumping variables to the screen or to a log file is easy in ColdFusion thanks to the CFDUMP and CFLOG tags.
One way to dump variables in Grails is to output them to the console. After reading through this and asking a co-worker for assistance, I still had to tinker a bit before I could get log.trace() statements to display in the console.
As it turns out, to change the log output level for an entity in Grails, you modify the log4j code block in your application’s Config.groovy file.
Example A: trace 'grails.app.controller.chapter4.SongController'
In this example “grails.app.controller” refers to ‘one or more controllers’ and ‘chapter4.SongController’ is the package and controller name. The code basically says ‘display all log messages at or above the trace level in SongController’. You can read more about logging levels here.
Here is a slight modification to the above code. This code allowed me to change the log settings for multiple controllers.
Example B: trace 'grails.app.controller.chapter4.SongController', 'grails.app.controller.chapter4.AlbumController'
Note that you can apply the same log settings to all controller files…
Example C: trace 'grails.app.controller'
And for reference here is SongController.groovy…
package chapter4

class SongController {
    def index = {
        log.trace("I can see log.trace")
        log.warn("I can see log.warn")
        log.error("I can see log.error")
    }
}
OK, back to the code. I’m deep into Chapter 4 of The Definitive Guide to Grails.
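The phrase ‘at or above the trace level’ comes down to an ordering of levels. Here is a minimal JavaScript sketch of that threshold check; it is not log4j's or Grails' API, and LEVELS and isEnabled are hypothetical names. The ordering follows the standard log4j levels.

```javascript
// Standard log4j level ordering, lowest severity first.
const LEVELS = ['trace', 'debug', 'info', 'warn', 'error', 'fatal'];

// A message is shown when its level is at or above the configured level.
// Assumes both arguments are entries in LEVELS.
function isEnabled(configuredLevel, messageLevel) {
  return LEVELS.indexOf(messageLevel) >= LEVELS.indexOf(configuredLevel);
}

// With the controller set to trace, all three calls in SongController show up.
const shownAtTrace = ['trace', 'warn', 'error'].filter((lvl) => isEnabled('trace', lvl));

// If the effective level were error, only log.error would be displayed.
const shownAtError = ['trace', 'warn', 'error'].filter((lvl) => isEnabled('error', lvl));

console.log(shownAtTrace.join(','), '|', shownAtError.join(','));
```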
Simple intro to logging with grails
Today I will walk through how to put the Tuckey URL Rewrite Java web filter into practice under an Apache Tomcat web server.
URL rewriting is the method of converting complex URL parameters into a more human-readable format, allowing simpler and more memorable URLs. This can be an important function if you start using frameworks or content management systems which automatically generate long and at times cryptic URLs. While URL rewriting on the more popular Apache HTTP Server is relatively easy to set up using the default mod_rewrite module, reproducing this functionality on Tomcat requires a little more work.
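The idea behind rule-based rewriting can be sketched in a few lines of JavaScript. This is not the Tuckey filter's configuration format or API; the rules, the target .jsp paths, and the rewrite function are hypothetical, and only show how a readable path is mapped onto a parameterised one.

```javascript
// Hypothetical rewrite rules: a pattern for the friendly URL and the
// internal URL it maps to. $1 and $2 refer to the captured groups.
const rules = [
  { from: /^\/products\/(\d+)$/, to: '/product.jsp?id=$1' },
  { from: /^\/blog\/(\d{4})\/([a-z0-9-]+)$/, to: '/blog.jsp?year=$1&slug=$2' },
];

// Apply the first matching rule; paths that match nothing pass through unchanged.
function rewrite(path) {
  for (const rule of rules) {
    if (rule.from.test(path)) {
      return path.replace(rule.from, rule.to);
    }
  }
  return path;
}

console.log(rewrite('/products/42'));
console.log(rewrite('/blog/2012/hello-world'));
console.log(rewrite('/about'));
```

A servlet filter applies the same kind of mapping to each incoming request before it reaches the application.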
URL Rewrite tool for Java application servers, e.g. Tomcat. Don't have to go via Apache or IIS.