Posted by jclayc on the 14th of October, 2007 at 5:10 pm under Coding Topics, General SEO Discussion and SEO for Law Firms.    This post has 2 comments.

An XML sitemap is a simple way to “spoon feed” your page locations and other info (like page importance, update period, etc.) to the search engines. Through the recently-adopted Sitemaps.org XML sitemap standard, one file can serve the top 4 search engines. Most webmasters know about Google Webmaster Central and Yahoo! SiteExplorer, but those XML sitemap submission methods are cumbersome. Here’s the short way to tell search engines about your XML sitemap:

Ask.com:

Google:

Yahoo!:

 

Live.com/MSN.com is still the odd man out. To submit your sitemap to them (as per Livesearch.spaces.live.com), modify robots.txt to include a line of the form Sitemap: <sitemap_location>, where <sitemap_location> is the complete URL of your Sitemap Index File (or your sitemap file, if you don’t have an index file).
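For anyone who would rather script the robots.txt change than edit the file by hand, here is a minimal Python sketch. The function name add_sitemap_line and the example.com URL are placeholders of mine, not anything the search engines provide, and the sketch assumes your robots.txt is plain text you can read and write yourself.

```python
def add_sitemap_line(robots_txt: str, sitemap_url: str) -> str:
    """Return robots.txt content that includes a Sitemap line for sitemap_url."""
    lines = robots_txt.splitlines()
    # Leave the file untouched if it already names this sitemap.
    for line in lines:
        key, _, value = line.partition(":")
        if key.strip().lower() == "sitemap" and value.strip() == sitemap_url:
            return robots_txt
    lines.append(f"Sitemap: {sitemap_url}")
    return "\n".join(lines) + "\n"


# Hypothetical existing robots.txt and sitemap location.
existing = "User-agent: *\nDisallow: /private/\n"
updated = add_sitemap_line(existing, "http://www.example.com/sitemap.xml")
print(updated)
```

Running it a second time on the updated text changes nothing, so it is safe to rerun.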

My advice: use www.xml-sitemaps.com to create your sitemap, upload it to your root directory and use the methods above to submit it!
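If you are curious what such a generator produces, here is a rough Python sketch that builds a small sitemap in the Sitemaps.org format using only the standard library. The page list, the example.com URLs and the build_sitemap helper are made up for illustration. A real sitemap file would normally also start with an XML declaration, which this sketch leaves out.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the Sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build a <urlset> document (as a string) from a list of page dicts."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = page["loc"]
        # changefreq and priority are optional hints to the engines.
        if "changefreq" in page:
            ET.SubElement(url, "changefreq").text = page["changefreq"]
        if "priority" in page:
            ET.SubElement(url, "priority").text = f"{page['priority']:.1f}"
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical pages; swap in your own URLs.
pages = [
    {"loc": "http://www.example.com/", "changefreq": "weekly", "priority": 1.0},
    {"loc": "http://www.example.com/contact.html", "changefreq": "yearly", "priority": 0.3},
]
sitemap_xml = build_sitemap(pages)
print(sitemap_xml)
```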

Posted by jclayc on the 16th of July, 2007 at 3:03 pm under Coding Topics, General SEO Discussion and SEO for Law Firms.    This post has 2 comments.

You may have read that, over the July 4th holiday, Google rolled out some subtle changes to its algorithm and rankings. I’ve seen a few notes about website age being one of the most noticeable factors in the changes. It’s a long story, but I’ve turned up pretty clear indications that they also rolled another feature into the new algorithm: Google is now indexing project titles and links within Flash SWF files. I’m still testing to see if they’re also spidering text content found inside SWFs, but I can say with 99% certainty that they picked up URLs that exist only inside an SWF I previously had online. What do you think?

Posted by jclayc on the 7th of May, 2007 at 9:33 pm under Coding Topics and General SEO Discussion.    This post has 2 comments.

The other day, I ran into a law firm with a website coded using the IFRAME element. While there is nothing programmatically wrong with IFRAMEs, they present some significant challenges for search engine optimization.

Here is a definition for “IFRAME” from Wikipedia:

“IFRAME (from inline frame) is an HTML element which makes it possible to embed another HTML document inside the main document… IFrames are more commonly used to insert content (for instance an advertisement) from another website into the current page.”

A few examples of sites using an IFRAME are:
http://www.samisite.com/iframephotos/index.htm,
http://calcium.brownbearsw.com/demos/miniframe.html &
http://www.webmonkey.com/webmonkey/96/37/stuff/iframe_ex.html

From a search engine optimization point of view, the use of the IFRAME is problematic for several reasons. First, whenever a search engine spiders the content that’s within an IFRAME, the search engine will normally link to the IFRAMED page itself instead of the “master” page it is housed within. Often, this means searchers are delivered to a page without site navigation. This is not optimal for keeping the attention of search engine spiders or visitors. A good example of a page like this within Google’s index:
http://www.paxilbirthdefect.com/source/about/prenatal_exposure.html.

Next, if you’re using an IFRAME to display another website’s content within your overall navigation structure, you’re now at the mercy of whatever that site changes its content to say. Today’s IFRAMED page about birth control may be tomorrow’s IFRAMED page about a less savory topic. Of course, search engines will recognize you’re serving another site’s content, so it will have no positive impact on your site’s rankings.

Additionally, some users turn off the IFRAME element in their browser’s advanced settings because of the security hole it presents (trusted websites can unwittingly serve malicious content via IFRAMEs).

Finally, and this is the most important point, I’ve spotted some sites that have multiple IFRAME “shells” on different URLs that all serve the same framed-in content. In the eyes of the search engines, this is duplicate content and is likely to be a reason for de-listing.
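To check a site for this pattern, one rough approach is to pull the IFRAME src out of each page and see which framed URLs turn up under more than one “shell”. The Python sketch below does that with the standard library’s HTML parser. The page URLs and markup are invented for the example, and it compares src values as written, without resolving relative paths.

```python
from collections import defaultdict
from html.parser import HTMLParser


class IframeSrcParser(HTMLParser):
    """Collect the src attribute of every iframe in a page."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag == "iframe":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def shells_by_framed_url(pages):
    """Map each framed URL to the list of page URLs that embed it."""
    shells = defaultdict(list)
    for page_url, html in pages.items():
        parser = IframeSrcParser()
        parser.feed(html)
        for src in parser.sources:
            shells[src].append(page_url)
    return dict(shells)


# Hypothetical pages: two shells framing the same content, one page without frames.
pages = {
    "http://www.example.com/a.html": '<html><body><IFRAME SRC="/content/about.html"></IFRAME></body></html>',
    "http://www.example.com/b.html": '<html><body><iframe src="/content/about.html"></iframe></body></html>',
    "http://www.example.com/c.html": "<html><body><p>No frames here.</p></body></html>",
}
duplicates = {src: urls for src, urls in shells_by_framed_url(pages).items() if len(urls) > 1}
print(duplicates)
```

Any framed URL listed with two or more shells is a candidate for the duplicate content problem described above.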

If you can’t tell, I don’t recommend using IFRAME in your site’s code. Just like framesets, there was a time for them, but that was years ago. Duplicate content, end-user experience and security concerns are factors I consider to be argument enough.

Posted by jclayc on the 14th of February, 2007 at 6:37 pm under Coding Topics, General SEO Discussion and Legal Websites.    This post has one comment.

I’ve been messing around with PDF files lately - trying to make sure they can be spidered by (at least) Google. Here’s a crash course in what I found. Using a full version of Acrobat, open the PDF you want to optimize and press CTRL + D. This will open the Document Properties. Within the PDF’s Document Properties, enter the PDF’s TITLE, Author, Subject and Keywords. Be accurate, be succinct and don’t spam.

The TITLE you set in your PDF Document Properties will show up in Google as the PDF’s link. (Without it, all of your PDFs will be indexed as Untitled.)

What about the rest of the document? Can the search engines read my PDF? Well, the answer is “it depends”. In general, if you open the PDF and can use the text tool to highlight individual lines of copy, it’s going to be indexable by Google. Another way to tell: open the PDF and press CTRL + A to select everything in the document. Then press CTRL + C to copy everything. Go to Notepad and press CTRL + V to paste what you just copied. If real text appears in your open Notepad, it’s searchable.

What about scanned PDFs? If you’re unable to select any text using the Text Tool, it’s likely your PDF is just an image of text — not searchable. What can you do about that? My best advice is to try to use Acrobat’s native OCR feature to convert that image to real, searchable text. Once the OCR has run, it won’t be apparent that anything has happened - that’s because Acrobat keeps that original image “in front of” the converted text. The converted text is now there, but it’s behind the scenes and only readable by “users” like search engine spiders. NOTE: the quality of the OCR is poor. I’ve never had much luck with it. To see what the converted text is, use the CTRL + A trick. This time, it will copy the converted text. When you paste it to Notepad, you’ll be able to see the quality of the results.

So, are PDFs searchable? The answer has to be… sometimes. Use the tips above to find out if your PDFs can be read as real text. If not, don’t worry: setting the Document Properties will at least let you convey the PDF’s TITLE to the engines.

Posted by jclayc on the 6th of February, 2007 at 5:37 pm under Coding Topics, Design Topics, General SEO Discussion and SEO for Law Firms.    This post has one comment.

On some sites I work on, I see the coders have used CSS to format the site’s H1 tags to not show the text but, instead, display an image in the text’s place. This is commonly called the “display:none” trick. The advantage is that it allows you to use an image as a header while, at the same time, keeping the text “version” behind the scenes. Spiders see the text, end-users see the image. Matt Cutts makes it clear Google is watching this “trick”… the biggest thing they’re watching is keyword spamming and multiple instances of the trick.

I’ve recommended that a few sites use the trick when a client insists on a crazy font for their content headers. Whenever you can, though, use real, CSS-formatted text in the H1 tag. If you do have to use the display:none trick, make sure what’s inside the H1 tag is pretty much one phrase… 3-4 words at most… and reflects the content of the page. No matter what, be careful!
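As a quick sanity check on the “3-4 words at most” rule, here is a small Python sketch that pulls the text out of each H1 on a page and flags anything that looks risky. The check_h1s helper and the sample headings are my own inventions, and flagging pages with more than one H1 is only a loose stand-in for “multiple instances of the trick”. It can’t tell whether the text is actually hidden by CSS.

```python
from html.parser import HTMLParser


class H1Collector(HTMLParser):
    """Gather the text content of every H1 tag in a page."""

    def __init__(self):
        super().__init__()
        self._in_h1 = False
        self.headings = []

    def handle_starttag(self, tag, attrs):
        if tag == "h1":
            self._in_h1 = True
            self.headings.append("")

    def handle_endtag(self, tag):
        if tag == "h1":
            self._in_h1 = False

    def handle_data(self, data):
        if self._in_h1:
            self.headings[-1] += data


def check_h1s(html, max_words=4):
    """Return a list of warnings about the H1 tags found in html."""
    parser = H1Collector()
    parser.feed(html)
    warnings = []
    if len(parser.headings) > 1:
        warnings.append(f"{len(parser.headings)} H1 tags found; expected one")
    for text in parser.headings:
        words = text.split()
        if len(words) > max_words:
            warnings.append(f"H1 has {len(words)} words: {' '.join(words)!r}")
    return warnings


# Hypothetical headings: one short phrase, one stuffed with keywords.
ok_page = '<h1 class="replaced">Dallas Injury Lawyers</h1>'
spam_page = "<h1>Dallas injury lawyer accident attorney Texas law firm</h1>"
print(check_h1s(ok_page))
print(check_h1s(spam_page))
```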

Posted by jclayc on the 24th of September, 2006 at 2:40 pm under Coding Topics, General SEO Discussion and Learning WordPress.    This post has one comment.

There’s a new tool on the block aimed at the booming blogger market - Windows Live Writer (Beta) 1.0. In summary, it’s a desktop application that allows WYSIWYG authoring of posts for platforms like WordPress, Blogger, Movable Type and more. In an interesting twist, the program also has an SDK (software development kit) so modules can be developed to extend its functionality. (For some reason, I get a chuckle out of their homepage statement “We can’t wait to see all the things people cook up with the SDK!”)

So is it worth a try? My immediate opinion is that the workspace looks very similar to Microsoft Word, without so many buttons. Matter of fact, the workspace looks kind of sparse, but with the basics covered. In contrast to w.bloggar, I don’t see any table options. We’ll have to see what I discover as I explore a bit more.

To cover all the bases, there are also the features available for general blog management like “updating weblog style”, trackbacks and other settings. One unique feature is the ability to insert Windows Live Maps. Something I just have to try:

Insertion of the map appears to be successful. With the ability to reference “bird’s eye” views through Microsoft Virtual Earth, the maps look nice, but the accuracy is questionable.

In a sceptical streak, I do hope the program doesn’t fudge up the code like other Microsoft applications tend to do. Time to post this and see!

Thanks to Andy at The lost outpost for the heads up on this program.
