security +

August 12, 2008

so…. I took the Security + exam last week and passed with an 885/900… which means I am

sweeeet!

security +

July 23, 2008

the past couple weeks have been busy… but not with Google. the GSA is basically doing its thing – set it and forget it i guess you could say.

i’ve been trying to get ready for the security + exam that i’ll be taking in two weeks. i’m just a LITTLE paranoid about bombing the test so i’ve been studying my arse off.

secure search

June 26, 2008

So basically, a secure search works like this…

Not to state the obvious but to start things off – a user enters a search query.

The appliance determines the relevant URLs and then determines whether the user has access to the content. The appliance then makes an authorization request to the appropriate webserver and stores the authorization data. The appliance caches that info for subsequent searches, making those searches faster.

The appliance will wait a maximum of 5 seconds for the content server to respond to each authorization (HEAD) request; if it gets no response, it moves on.

Too many of these timeouts could result in poor results, or no results at all.

If your content sits behind a proxy server or reverse proxy, that can slow down the process and cause timeouts.
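To get a rough idea of whether a content server answers fast enough, you could time a HEAD request yourself. This is only a sketch in Python, not anything the appliance itself runs: the function name, the `opener` parameter and the example URL are made up for illustration, and the 5-second limit is the one mentioned above.

```python
import time
import urllib.error
import urllib.request

GSA_AUTH_TIMEOUT = 5.0  # seconds; the limit described in this post


def head_response_time(url, timeout=GSA_AUTH_TIMEOUT, opener=urllib.request.urlopen):
    """Return the seconds a HEAD request took, or None if no response arrived."""
    request = urllib.request.Request(url, method="HEAD")
    start = time.monotonic()
    try:
        with opener(request, timeout=timeout):
            pass
    except urllib.error.HTTPError:
        # An error status (401, 403, ...) is still a response that arrived in time.
        pass
    except OSError:
        # Timeouts, refused connections and DNS failures all end up here.
        return None
    return time.monotonic() - start
```

Running something like `head_response_time("https://mysitesurl/some/doc")` from a machine that sits on the same side of the proxy as the appliance gives the elapsed seconds, or None if nothing came back within the limit.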

Just had this happen – thought I’d make a note about what you have to do.

I made a change to a stylesheet – was able to preview the changes made through the admin console and everything looked good…. so I saved those changes. Went to the web to see the page and the new changes, but they weren’t showing up.

If this happens – you need to refresh the cache on the appliance.

To do this, add this parameter to the end of your search URL: &proxyreload=1
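If you’d rather not edit the URL by hand, here is a small Python sketch that tacks the parameter on. The function name and the example URL are just placeholders for illustration; only the proxyreload=1 parameter comes from the note above.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_proxyreload(url):
    """Return the same URL with proxyreload=1 appended to its query string."""
    parts = urlsplit(url)
    # Drop any existing proxyreload value so the parameter appears only once.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "proxyreload"
    ]
    query.append(("proxyreload", "1"))
    return urlunsplit(parts._replace(query=urlencode(query)))


print(with_proxyreload("https://mysitesurl/search?q=test"))
```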

Installing an SSL cert

June 19, 2008

this should be simple… then again – nothing is ever as simple as it should be.

First you need a certificate authority file from the vendor of your certificate – you need a CA file for each server you want to access through https.

To upload these CA files, go to the Administration section of the admin console > Certificate Authorities > and browse to the file. Also, these files must be in the supported .pem format; if they aren’t, they will need to be converted… use openssl to do this.
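For what it’s worth, if a CA file shows up in binary DER form (that’s an assumption on my part, your vendor may hand you something else), the conversion to .pem can also be sketched with Python’s standard ssl module instead of the openssl command line. The function name and file names below are hypothetical.

```python
import ssl
from pathlib import Path


def der_file_to_pem(der_path, pem_path):
    """Read a DER-encoded certificate and write it back out as PEM text."""
    der_bytes = Path(der_path).read_bytes()
    # DER_cert_to_PEM_cert base64-encodes the bytes and adds the
    # BEGIN/END CERTIFICATE lines; it does not validate the certificate.
    pem_text = ssl.DER_cert_to_PEM_cert(der_bytes)
    Path(pem_path).write_text(pem_text)
    return pem_text
```

Something like `der_file_to_pem("vendor-ca.der", "vendor-ca.pem")` would then give you a file to upload. Since nothing here checks that the input really is a certificate, it’s worth opening the result and eyeballing it first.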

So…. all of the CA files are in place on the Google box.

Next, go to the SSL Settings section > Install an SSL Certificate, and browse to find your cert and private key (in .pem format).

Click the View Certificate Info button – and this should display the new cert info.

If you’re ready to install the cert, click the ‘Install SSL Certificate’ button – how simple is that? (By the way, this will restart the appliance.) If all that works, you’ll log in again. Test the appliance by running some queries for secure content; if you don’t get any prompts to enter credentials, etc., then the cert should be good.

Let’s take a step back though… what if, after you browse to find your cert and .pem file and click View Certificate Info, the page refreshes but nothing has changed? Well – that is a good question. First thing to check is, are your certificate authorities installed? In my case – Yes.

So what next – the cert file seems to be ok, the private key seems to be ok, it’s in the proper format.
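One way to double-check the "seems to be ok" part outside the appliance is to ask Python’s ssl module to load the cert and key together; it refuses if a file isn’t readable PEM or if the key doesn’t belong to the cert. This is just a sanity-check sketch with a made-up function name and file names, and it says nothing about what the appliance itself checks.

```python
import ssl


def cert_and_key_load(cert_path, key_path):
    """Return (True, "") if the PEM cert and private key load as a pair."""
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    try:
        # Raises ssl.SSLError for bad PEM data or a key that does not match
        # the certificate, and OSError if a file cannot be opened.
        context.load_cert_chain(certfile=cert_path, keyfile=key_path)
    except OSError as exc:  # ssl.SSLError is a subclass of OSError
        return False, str(exc)
    return True, ""


ok, reason = cert_and_key_load("mycert.pem", "mykey.pem")
print("pair loads" if ok else "problem: " + reason)
```

Heads up: if the private key is passphrase-protected, the underlying OpenSSL library may stop and prompt for the passphrase on the terminal.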

What next? Good question – still working on that.

Nothing is ever as simple as it should be.

I talked a little about front ends in my previous post…

Creating a front end in the admin console is pretty straightforward. Customizing that front end – extensive customization, anyway – requires you to modify the XSLT.

What if you just want to put a search box on an existing web page, and have that search box point to your search appliance?

That can be accomplished by adding this bit of code:

<form action="https://mysitesurl/search" method="get">
<input type="hidden" name="collection" value="default_collection">
<input type="text" name="q"> <input type="submit" value="Search">
</form>

That should do it.
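A form like this just sends a GET request with its fields in the query string, so you can build the same URL by hand to test things. A quick Python sketch, assuming the query text travels in a parameter named q; the function name is made up, and mysitesurl is the same placeholder as above.

```python
from urllib.parse import urlencode


def search_url(query, base="https://mysitesurl/search", collection="default_collection"):
    """Build the URL a GET search form would request for the given query."""
    params = {"collection": collection, "q": query}
    return base + "?" + urlencode(params)


print(search_url("annual report"))
```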

For more reference info, check out this page.

Basically it works like this…

Once the GSA is installed on your network and configured – you can begin to crawl documents, creating a master index of files. Your users can immediately begin to search over this index from the search interface.

You can create custom search pages (front ends), or use the default front end. You can create multiple front ends… a search interface for each department or sector of your business for example. You can create as many front ends as you like, and you can modify certain attributes of those front ends from the admin console. If you need to make extensive changes, or create custom pages – you have to modify the xslt code behind the front end… I’ll save that for another blog.

The search appliance can literally index millions of files; that number varies depending on the license you have and the model of GSA that you buy.

The GB-1001 is a 2U server that can be licensed to search up to 3 million documents.

The GB-5005 is a 5U server with a capacity of 10 million documents – it features built-in clustering and failover.

Next up is the GB-8008

This monster can index up to 30 million documents out of the box and apparently has the ability to index an unlimited number of documents. Better have a fat wallet.
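To put those numbers side by side, here’s a tiny Python sketch that picks the smallest model whose capacity covers a given document count. The capacities are just the figures quoted in this post; the dictionary and function names are made up, and actual limits depend on your license.

```python
# Document capacities as quoted in this post (license-dependent in practice).
MODEL_CAPACITY = {
    "GB-1001": 3_000_000,
    "GB-5005": 10_000_000,
    "GB-8008": 30_000_000,
}


def smallest_model_for(doc_count):
    """Return the smallest model that covers doc_count, or None if none does."""
    for model, capacity in sorted(MODEL_CAPACITY.items(), key=lambda item: item[1]):
        if doc_count <= capacity:
            return model
    return None
```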

Looked like everything was set up as it should be from the admin console – index was complete, collection was pointing where it should be – but the only thing we get in the form of results is a message saying ‘no documents matched your query’. What the FAQ!

Hello – plenty of documents match the query… I see them in the index!

Turns out – the timeout period on the GSA is 5 seconds (max) for an individual request. With the GSA being behind a proxy (reverse proxy) – things can get tricky, or in this case ‘no documents matched your query’. Response times in the 10-15 second range aren’t going to cut it; the GSA waits 5 seconds for a response, doesn’t get one from the content server, and says goodbye, have a nice day. Then it tells the user ‘no documents matched your query’.

So – what now?

Well – the maximum amount of time the GSA will wait on an individual request is 5 seconds, and 5 seconds isn’t cutting it.
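If you’ve collected some response times from the content server (from proxy logs or your own stopwatch tests), a few lines of Python can show how many blow past the limit. This is only a sketch: the function name and the sample numbers are made up, and None stands for a request that never answered.

```python
GSA_TIMEOUT_SECONDS = 5.0  # the per-request limit described in this post


def summarize_timeouts(response_times, limit=GSA_TIMEOUT_SECONDS):
    """Count how many response times (in seconds) exceed the limit."""
    too_slow = [t for t in response_times if t is None or t > limit]
    return {
        "checked": len(response_times),
        "too_slow": len(too_slow),
        # True only when there were samples and every one of them was too slow.
        "all_too_slow": bool(response_times) and len(too_slow) == len(response_times),
    }


print(summarize_timeouts([10.2, 12.5, 15.0]))
```

When every request is too slow, that lines up with the ‘no documents matched your query’ situation described here.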

Is there anything within the admin console to help resolve this? Short answer… No.

Can the proxy be tweaked somehow? Probably. In this case, however, the answer was removing the machines from behind the proxy.

I did.

I went to Figleaf’s fundamentals course in DC. It covers everything from installing the GSA to modifying the XSLT in your stylesheets. It’s a hands-on type of course too – not just a Google guru talking for 8 hours straight… granted there is a lot of talking, but you do get to work on your own GSA – testing the various things you’re being taught. This course will cost you about $1800 – that covers 3 full days of training.

If your company won’t spring for that… I’d start with Google’s documentation and forum.

I can’t speak for the other training providers you might find – the options are fairly limited.

well – so do I.

so I figured why not blog about it.

To get the easy things out of the way – what the heck is a GSA?

The Google Search Appliance is a cool new ‘appliance’… ok – really it’s a Dell PowerEdge 2950 fitted with a yellow case that has a few holes in it, swiss cheese style.

More info to come…. like what it does besides look pretty cool.