# Blosxom Plugin: extensionless
# Author(s): Frank Hecker <hecker@hecker.org>
# Version: 0.4
# Documentation: See the bottom of this file or type: perldoc extensionless

package extensionless;

use strict;

use CGI qw/:standard :netscape/; 


# --- Configurable variables -----

my $hide_category = 0;                  # set to 1 to have an extensionless
                                        # entry URI hide a category of the
                                        # same name as the entry

my $hide_year = 0;                      # set to 1 to have an extensionless
                                        # URI with 4-digit entry name hide an
                                        # archive page for a year

my $debug = 0;                          # set to 1 to print debug messages

# --------------------------------
# __END_CONFIG__

sub start {
    warn "extensionless: \$path_info: '$blosxom::path_info' (before)\n"
        if $debug > 0;

    # Recover the original path passed to Blosxom, in order to properly
    # handle extensionless URIs where the entry name starts with a digit
    # and was parsed by Blosxom as a date reference, not as part of the path.

    my $path_info = path_info() || param('path');
    $path_info =~ s!(^/*)|(/*$)!!g;

    # Check for the following situation:
    #
    #   * the requested URI does not include a file extension
    #   * the requested URI does not *exactly* name a category OR
    #     we want the URI to always resolve to an entry if one exists
    #   * the requested URI does not appear to refer to an archive for a year
    #     OR we want the URI to always resolve to an entry if one exists
    #   * there is an entry corresponding to the URI
    #
    # If the above conditions hold then we update the URI (more specifically,
    # the path component of the URI) to include the flavour's file extension.
    # (The flavour used will be the default flavour, a flavour specified
    # using the flav parameter, or a flavour set by another plugin.)
    #
    # We also set $path_info_yr to be undefined in order to prevent Blosxom
    # and other plugins (e.g., storystate) from considering this an archive
    # page reference in the case where the entry name starts with a digit.

    if ($blosxom::path_info !~ m!\.!
	and (! -d "$blosxom::datadir/$path_info" || $hide_category)
	and ($path_info !~ m!(/19[0-9]{2}$)|(/2[0-9]{3}$)! || $hide_year)
	and (-r "$blosxom::datadir/$path_info.$blosxom::file_extension")) {
	$blosxom::path_info = "$path_info.$blosxom::flavour";
	$blosxom::path_info_yr = undef;
    }

    warn "extensionless: \$path_info: '$blosxom::path_info' (after)\n"
	if $debug > 0;

    1;
}

1;

__END__

=head1 NAME

Blosxom plugin: extensionless

=head1 SYNOPSIS

Make Blosxom recognize extensionless URIs corresponding to individual
entries. See http://www.w3.org/Provider/Style/URI for motivation.

=head1 VERSION

0.4

=head1 AUTHOR

Frank Hecker <hecker@hecker.org>, http://www.hecker.org/

This plugin is now maintained by the Blosxom Sourceforge Team,
<blosxom-devel@lists.sourceforge.net>.

=head1 DESCRIPTION

This plugin allows Blosxom to recognize extensionless URIs for
entries, as recommended by http://www.w3.org/Provider/Style/URI and
elsewhere.  In other words, if you are requesting a page for an
individual entry then you can omit the file extension that would
normally be required for specifying the flavour, unless the extension
is required to disambiguate the URI in the case where the entry has
the same name as a Blosxom category or looks like a 4-digit year
reference.

For example, if you have an entry C<foo/bar.txt> in your Blosxom data
directory then by using this plugin you can use a URI like

  http://www.example.com/blog/foo/bar

to request the entry rather than a URI like

  http://www.example.com/blog/foo/bar.html

unless there is an existing category C<foo/bar> (i.e., there is a
subdirectory of that name in the Blosxom data directory). If there is
such a name conflict then the plugin will cause Blosxom to display the
category and not the individual entry.

(You can change this behavior using a configurable variable as
described below. The plugin's default behavior matches the behavior of
Apache when using the MultiViews option, where Apache will display the
index for an existing directory C<foo/bar> in preference to displaying
a file C<foo/bar.html>.)

You can also use extensionless URIs like

  http://www.example.com/blog/foo/1st-post

where the entry name starts with a digit, as long as the entry name
doesn't look like a 4-digit year (e.g., "2004") used as part of a
request for an archive page for that year. (Again, you can set a
configurable variable to override the default behavior and have entry
names that look like year references.)

When an entry is requested using an extensionless URI the existing
value of the Blosxom C<$flavour> variable will determine the flavour
used for the entry. Normally this value will be either that specified
by the C<flav> parameter (if it is present) or the default flavour
configured into Blosxom; however another plugin could override this
value if desired. For example, a separate plugin could do content
negotiation (e.g., using the HTTP Accept header) to determine what
flavour should be served for an entry. (However see the discussion
below regarding plugin execution order.)

Thus assuming that C<foo/bar.txt> exists in the Blosxom data directory
(and does not conflict with a category C<foo/bar>) and the default
flavour is "html" then the following URIs are all equivalent:

  http://www.example.com/blog/foo/bar
  http://www.example.com/blog/foo/bar.html
  http://www.example.com/blog/foo/bar?flav=html

=head1 INSTALLATION AND CONFIGURATION

Copy this plugin into your Blosxom plugin directory. You do not
normally need to rename the plugin; however see the discussion below
regarding plugin execution order.

Change the value of the configurable variable C<$hide_category> to 1
if you want an extensionless URI to display an entry instead of a
category of the same name. As described above, the default behavior is
that in the event of a name conflict the category is displayed in
preference to the entry, so that displaying the entry in that case
requires explicitly adding a file extension to the URI. If you change
the plugin's behavior from the default then the category can be
displayed only by using a URI explicitly referencing an index page
(e.g., C<.../foo/index.html>) or by referencing an abbreviation for
the category.

Change the value of the configurable variable C<$hide_year> to 1 if
you want an extensionless URI to display an entry in the case where
the URI otherwise appears to be a reference to an archive page for a
given year, for example, with the URI

  http://www.example.com/blog/elections/2004

where an entry C<2004.txt> exists under the category C<elections>. As
described above, the default behavior is that in the event of a
conflict the archive page is displayed in preference to the entry, so
that displaying the entry in that case requires explicitly adding a
file extension to the URI. If you change the plugin's behavior from
the default then the archive page for that year can be displayed only
by usng a URI explicitly referencing an index page (e.g.,
C<.../elections/2004/index.html>).

You can also change the value of the variable C<$debug> to 1 if you
need to debug the plugin's operation; the plugin will print to the web
server's error log the original path component of the URI and the
value after any modification is done.

This plugin supplies only a start subroutine and can normally coexist
with other plugins with start subroutines; however if the other
plugins' start subroutines depend on the standard interpretation of
the Blosxom variable C<$path_info> (i.e., that requests for individual
entries have a period '.' in C<$path_info>) then you should rename
this plugin (e.g., to "00extensionless") to ensure that it is loaded
first and can modify C<$path_info> to match the standard
interpretation before other plugins reference its value.

Also, if you have a plugin that overrides the value of the Blosxom
C<$flavour> variable then you should ensure that that plugin's code
runs prior to this plugin's start subroutine being executed.

=head1 ACKNOWLEDGEMENTS

This plugin was inspired by the cooluri plugin by Rob Hague
http://www.blosxom.com/plugins/link/cooluri.htm and the cooluri2
plugin by Mark Ivey http://www.blosxom.com/plugins/link/cooluri2.htm
but has much less ambitious goals; the code is correspondingly much
simpler.

Thanks also go to Stu MacKenzie, who contributed a patch to fix a
major bug preventing the use of extensionless URIs having entry names
beginning with digits.

=head1 SEE ALSO

Blosxom Home/Docs/Licensing: http://blosxom.sourceforge.net/

Blosxom Plugin Docs: http://blosxom.sourceforge.net/documentation/users/plugins.html

=head1 BUGS

In order to check whether a file extension was originally included in
the URI we check the value of C<$blosxom::path_info>. As noted in a
comment in the code, this value is not the actual URI path in the case
where the URI contains a component starting with a digit; however
checking C<$blosxom::path_info> works for the URIs of interest to us
and (unlike checking our local C<$path_info> value) does not produce
potentially spurious results on invalid Blosxom URIs like

  http://www.example.com/blog/foo/123bar/baz.html

(Unlike entry names, Blosxom category names can never start with a
digit, and allowing them to do so is beyond the scope of this
plugin. For this example URI Blosxom would have set C<$path_info> to
C<foo> and attempted to parse C<123bar> and C<baz.html> as
C<$path_info_yr> and C<$path_info_mo> respectively.)

Using an entry name that looks like a year reference is problematic
because of potential ambiguity regarding what actually is a year
reference and what is not. The code as presently written takes a
relatively (but not completely) conservative approach: values from
1900 through 2999 are considered to be year references, and by default
will hide references to entries of the same name (sans extension).

Also note that versions of this plugin prior to 0.4 are not
compatible with Blosxom versions 2.0.1 and higher; in prior versions
this plugin executed its code in the filter subroutine, and in Blosxom
2.0.1 the execution of plugins' filter subroutines was moved in a way
that broke this plugin. This problem was fixed in version 0.4 of this
plugin by moving processing to the start subroutine.

Finally note that this plugin assumes that you are storing Blosxom
entries in the standard manner, i.e., as text files in the file system
within the Blosxom data directory. It will not work if you are using
other plugins like svn-backend that use different storage mechanisms.

Please send other bug reports and feedback to the Blosxom development 
mailing list <blosxom-devel@lists.sourceforge.net>.

=head1 LICENSE

extensionless Blosxom plugin Copyright 2004-2006 Frank Hecker

Permission is hereby granted, free of charge, to any person obtaining a
copy of this software and associated documentation files (the "Software"),
to deal in the Software without restriction, including without limitation
the rights to use, copy, modify, merge, publish, distribute, sublicense,
and/or sell copies of the Software, and to permit persons to whom the
Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included
in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
OTHER DEALINGS IN THE SOFTWARE.
